Count of specific character in a text

**FishVal** · Feb 11 '09, 07:11 AM

Fathers,

Actually (now it is related to original question) I would not use a query with 36 calculated fields. Characters to count (in Klingon, Esperanto or even in Shakespare's Talk) could be stored in an additional table. This table could be joined using cartesian join with the target table to produce all possible combinations of [text] and [character] which are used to calculate character ocurence count. Thus obtained dataset could be used as a source for crosstab query.

Query: qryCount

Code:

SELECT tbl.txt, tblChars.txtChar,
Len(tbl.txt)-Len(Replace(tbl.txt, tblChars.txtChar, "",1,-1,1)) AS lngCount
FROM tbl, tblChars;

Query: qryCrosstab

Code:

TRANSFORM First(qryCount.lngCount) AS lngCount
SELECT qryCount.txt
FROM qryCount
GROUP BY qryCount.txt
PIVOT qryCount.txtChar;

**OldBirdman** · Feb 12 '09, 04:14 PM

Ready-Fire-Aim

We don't know, and nobody has asked, why this is needed. For the example given, text = "76DC4-34", I can't think of any reason why this might be needed. I assumed English (26 letters) and not Klingon, Russian, or Arabic.
I think the way to proceed depends on this answer. Is it 1) a One-Time study, 2) a query or table field(s) even if temporary, or 3) a calculated control? When shahaba says "I assume I'm going to have to repeat a process 36 times ... ", does this mean he will be working with 36 values, or can he reduce it to one value, say the largest? If the answer is 1), then a list might be appropriate. If 2) or 3), this has to be reduced to a single number or string. These require different solutions.

**FishVal** · Feb 12 '09, 04:36 PM

:D

Aim? Who does care. Just fire.

**NeoPa** · Feb 16 '09, 03:40 PM

I think the OP forgot to come back (Either that, or they were frightened off by all the high-faluting "discussion s").

In truth though, it's always interesting to get the points of view of other experts (Yes OB - That includes you too, regardless of tag), even when they are not strictly required for the original question.

**OldBirdman** · Feb 16 '09, 04:28 PM

NeoPa, as the only non-expert here, I am glad you appreciate my input.
"shahaba" is a Newbie (1 post), and needs to be invited back into this discussion.
shahaba, can you give a few more details, such as how you are going to use this number. Is it a field in a table, an unbound control on a form or report, a variable in a VBA formula, a one-time special study, or ...?
You write:

I assume I'm going to have to repeat a process 36 times (10 for each numeric digit and 26 for each letter, ... )

which means you may need 36 values. How are these to be used. Maybe you really don't need a query, only a function. You have 4 experts interested here, lets get to a solution.
Don't give up on this forum, it may be the best one out there!

**NeoPa** · Feb 16 '09, 11:11 PM

Originally posted by OldBirdman

NeoPa, as the only non-expert here, I am glad you appreciate my input.

I certainly do ;)

Particularly how you look to get the precise question sorted out as a basis for continuing (not just in this thread).

**shahaba** · Feb 18 '09, 03:39 PM

Sorry for not responding sooner, this issue was put on the backburner for a while for various reasons. So far the information has been useful even though most of what is said here is going over my head. Let me answer what I can and if you want any more information I'll be checking back regularly.

I think that getting the results as fields in a table might be the most useful.

What I meant by 36 times is refering to if there was such built in function I'd be using it once for each digit 0-9 and once for each letter.

To be clear, I need the sum of the characters in all records of a table not the sum of each character in for each individual records

This in not a 1 time study. Currently the actual table I plan to run this on has 25000 records of varying length up to 14 characters. But I'll likely need to run this process everytime we get new data to append to this table.

**NeoPa** · Feb 18 '09, 05:25 PM

Is the data for each record contained in a single field?

PS. With this new information we'd certainly be looking at doing it differently. I'll give it some thought.

**OldBirdman** · Feb 18 '09, 06:56 PM

Does it matter as strAll = Str1 [& Str2 [& Str3 ...]]?
A function is going to be called 350,000 times. Isn't the question whether to do this in queries or code?
We're calling f(strAll, strLetter) for each letter 0-9, A-Z for each record.
Solution 1 - Queries - 1) SELECT f(strAll, "A") AS CountA, ..., f(strAll, "9") AS Count9 will result in 25,000 rows, 36 fields. 2)SUM values in each field
Solution 2 - Code (logic) - Do Until EOF; Do strX = "A" to "Z", "0" to "9"; txtLetter&strX = txtLetter&strX + f(strAll, strX); where txtLetter&strX are textboxes on a form.
Function f returns Len(strAll)-Len((Replace(st rAll,strX, "")) per ADezii and this code could replace actual call either in SQL Statement or Code.
Disclaimer - The above is not intended to be code in any language. It is intended to point this discussion in one of two directions.
shahaba - My inquisitive mind would like to know the reason behind this need. Can you tell us. OldBirdman

**NeoPa** · Feb 18 '09, 07:35 PM

Firstly, I'm going to proceed here, but please remember to respond to OB's post too.

In my considered opinion, the very real benefit of using Replace() for a single character, is more than lost when it has to be run on all the characters in the alphabet + numerals. This may not be true for an enormous string, but if we have many records then just putting the data together into a string would be enormously processor intensive, not to say time consuming.

I'm thinking on the basis that if the data in a record is up to 14 characters long, it's almost certainly in a single field.

I can't think of a reliable and efficient way of arranging the data such that it can be worked on as a single piece of data (string) so I've concluded the best way to handle this is to use DSum() to sum the values produced (by a function call we'll get to later) for each record in the table.

The function must essentially return the count of qualifying characters in the string data passed.

**NeoPa** · Feb 18 '09, 07:52 PM

The following function should do the trick :

Code:

Public Function CountValid(strData As String) As Long
  Dim intX As Integer

  CountValid = 0
  For intX = 1 To Len(strData)
    If InStr(1, _
             "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789", 
             Mid(strData, intX, 1), _
             vbTextCompare) > 0 Then CountValid = CountValid + 1
  Next intX
End Function

Use vbBinaryCompare in line #9 if only upper case characters are required.

To get the total simply use :

Code:

Call MsgBox("Valid chars in database = " & _
            Format(DSum("CountValid([YourFieldName])", _
                        "[YourTablename]"), "#,##0\."))

**NeoPa** · Feb 18 '09, 07:58 PM

If the format of the data is known always to fit any type of template, then that may well effect how best it can be done (the only characters that are ever included in the data, but which should not be counted, come from a list shorter than 36 for instance).

Otherwise there is an open invitation to post ideas. We're looking for those which are more efficient in execution rather than less though, so please review some of the known issue laid out earlier before contributing.

**shahaba** · Feb 18 '09, 08:53 PM

The purpose of all this is creating Identity Tags by piecing these tags from single characters which must be requested in advance.

The data for each record is contained within a single field

Other characters that may appear are \ / -

**Unused01** · Feb 18 '09, 09:41 PM

OldBirdman replies

The purpose of all this is creating Identity Tags by piecing these tags from single characters which must be requested in advance.

Maybe this is an idea that has been proposed but should be forgotten. Random numbers or ? might be another approach to solving the underlying problem, which is to create Identity Tags.
In my defense, I realized that the data was probably in 1 field, but if not, the 2 or 3 fields could be concatinated, thus this issue could be ignored. I never considered creating a string of many thousands of characters, but 14 characters max.
The function "Function CountValid" does not do what is requested. It counts the total number of characters that are one of the 36 allowed. It does not count the number of each letter (i.e. How many "K"s?). From post #1:

I have a list of texts and I need to find out the number of times each character shows up.

**NeoPa** · Feb 18 '09, 09:47 PM

Originally posted by shahaba

The purpose of all this is creating Identity Tags by piecing these tags from single characters which must be requested in advance.

How does counting the total number of these 36 characters relate to this :S

Count of specific character in a text

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment