Socoder -> Python/Ruby -> Regular Expression help

Fri, 07 Dec 2007, 20:49
JL235
I'm having trouble writing a regular expression, because I think it's impossible. I believe it might be more of a job for a grammar, but I have a different (but worse) solution lined up if I can't do it with a regular expression. What I want to do is take a string of characters, any length, of let's say b's and a's. I am trying to write an expression which will match a minimum of 8 a's anywhere within 28 consecutive characters of that string.

The only way I can think of doing this would be to type out every combination of a's and b's with an | between each one. Is there a proper way?

And this actually has nothing to do with Ruby (I'm writing this in Java). We just don't have a general coding section.
Sat, 08 Dec 2007, 00:53
Dabz
Don't know anything about Ruby, but here's a rough and ready example I made ages ago for Blitz:-



That might shed some light!

Dabz

-=-=-
Intel Core i5 6400 2.7GHz, NVIDIA GeForce GTX 1070 (8GB), 8Gig DDR4 RAM, 256GB SSD, 1TB HDD, Windows 10 64bit
Sat, 08 Dec 2007, 04:03
Jayenkai
Tip : When dealing with RegEx, I tend to load up Programmer's Notepad and use the Find panel, which contains a RegEx function.

AAAAAAA+A = A group of 8 or more A's
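
For anyone trying that in Java rather than in an editor's Find panel, here is a minimal sketch. It assumes the counted form `A{8,}`, which means the same as AAAAAAA+A; the class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

class EightOrMore {
    // Same meaning as AAAAAAA+A: a run of at least eight consecutive A's.
    static final Pattern RUN = Pattern.compile("A{8,}");

    // True if the string contains such a run anywhere.
    static boolean hasRun(String s) {
        return RUN.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(hasRun("xxAAAAAAAAxx")); // eight A's in a row
        System.out.println(hasRun("AAAAxAAAA"));    // eight A's, but not consecutive
    }
}
```

Note that this only finds A's that sit next to each other; it says nothing about A's spread out through a chunk.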

-=-=-
''Load, Next List!''
Sat, 08 Dec 2007, 08:39
JL235
Thanks MaDanthon, but that is essentially my backup plan: to make an algorithm that searches through the string, counting by hand, although the one I have planned is slightly different and bigger. The thing is that the other rules I'm applying to this were all done in regular expressions too. Doing this one as a regular expression would mean I used 2 lines of code instead of, say, 20, and would be far easier.

As I said above, it's also not written in Ruby. But Ruby and Python support regular expressions in the language, so it kinda looks like a decent place to put it.

Jay, it's not just about ensuring there are 8 a's. The regular expression needs to match the entire string, which might be, say, 100 characters long. However, every run of 28 consecutive characters needs to contain a minimum of 8 a's. There could be characters between each a, there could be characters after the a, there could be characters before the a. It doesn't matter, as long as every 28 characters contain at least 8 a's.
Sat, 08 Dec 2007, 15:23
Jayenkai
So.. You mean.. All you need is to count the number of A's in a chunk?

Cut chunk, regex strip everything but A's, then count Len.
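
In Java that idea might look roughly like the sketch below. The name `countA` is hypothetical and lowercase 'a' is an assumption; it only handles one chunk that has already been cut out.

```java
class ChunkCount {
    // Strip everything that isn't an 'a', then measure what's left.
    static int countA(String chunk) {
        return chunk.replaceAll("[^a]", "").length();
    }

    public static void main(String[] args) {
        String chunk = "abbabbbaab"; // hypothetical chunk cut from a longer string
        System.out.println(countA(chunk));
    }
}
```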




Sat, 08 Dec 2007, 16:11
JL235
That wouldn't work. What if it was 30 a's and then 70 different characters? It would be valid up to character 50. Then the consecutive sections of 28 characters would start running out of a's. Remember this has to be consecutive.

I have also today implemented a non-regular expression version.
Sat, 08 Dec 2007, 16:58
Jayenkai
Could you give an example, because I really don't have a clue WTF you mean..!

Sat, 08 Dec 2007, 17:07
Dabz
What Greenie said!!!

I'm lost too and have been trying for the last 15mins to wangle my brain round it... bearing in mind... I'm drunk!!!

Which should've helped!

Dabz

Sat, 08 Dec 2007, 17:47
HoboBen
Maybe look up Backus Naur Form (wiki) - "a precise language for presenting a grammar; most grammars are written in variants of BNF."

I keep meaning to write my own parser - it looks capable of nearly anything, and it's a heck of a lot simpler than RegExp!

e.g.

To define a valid variable name for a programming language, starting with a letter then using any alphanumeric letters, you could do:
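
A rule of that shape, sketched here as an assumption and not as the snippet originally posted, could be written in BNF style and checked in Java like this. The BNF in the comment, the ASCII-only letters and the names `VariableName` and `isValidName` are all illustrative.

```java
import java.util.regex.Pattern;

class VariableName {
    // BNF-style reading of the rule (illustrative only):
    //   <name>   ::= <letter> | <name> <letter> | <name> <digit>
    //   <letter> ::= "a" | ... | "z" | "A" | ... | "Z"
    //   <digit>  ::= "0" | ... | "9"
    // The same rule as a pattern: one letter, then any mix of letters and digits.
    static final Pattern NAME = Pattern.compile("[A-Za-z][A-Za-z0-9]*");

    // True only if the whole string is a valid name under the rule above.
    static boolean isValidName(String s) {
        return NAME.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidName("score2")); // starts with a letter
        System.out.println(isValidName("2score")); // starts with a digit
    }
}
```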



-=-=-
blog | work | code | more code
Sat, 08 Dec 2007, 18:07
Dabz
Now I'm totally lost!!!

Are we looking for a simple matching selection of a certain character within a string, or are we defining a regular expression?

I've written a small interpreter in Blitzbasic:-

www.syntaxbomb.com/forum/index.php/topic,113.0.html

If that's any help... But since hobo's post (And I mean no disrespect)... I'm goosed!!!

Dabz

Sat, 08 Dec 2007, 18:18
JL235
First, your example is a little wrong HoboBen. You should really do a lot of that in the lexical analyser, picking out tokens (like words, digits, operators, brackets, braces, etc). You then put them together in the grammar. Your example seems to be trying to do both.

But I don't want to use a grammar because I'd have to connect CUP and JFlex into my project. If all that was already done, then it'd be less code. But since it's not, it's a lot more and would make it more complicated.

I've now decided to use the backup option and not use the regular expression. Instead I've written a method about 20 lines long which converts the string to an array of characters and looks through it, counting how many a's there are. When the index value, i, becomes greater than or equal to 28, I then start subtracting the a's found at i-28. When it is in this mode there should always be 8 or more a's counted.
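
A minimal sketch of that kind of running count, not the actual assignment code, might look like this. The names `WindowCheck` and `isValid` are made up, and two details are assumptions: the first full window (indices 0 to 27) is checked as well, and a string shorter than 28 characters is treated as valid because it contains no full window.

```java
class WindowCheck {
    // window = 28 and minA = 8 in the rule discussed in this thread.
    static boolean isValid(String s, int window, int minA) {
        char[] chars = s.toCharArray();
        int count = 0; // number of a's in the window that ends at index i
        for (int i = 0; i < chars.length; i++) {
            if (chars[i] == 'a') count++;
            // Once past the first window, drop the character that just slid out.
            if (i >= window && chars[i - window] == 'a') count--;
            // A full window ends at i whenever i >= window - 1.
            if (i >= window - 1 && count < minA) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // The 30 a's followed by 70 other characters case (String.repeat needs Java 11+).
        String s = "a".repeat(30) + "b".repeat(70);
        System.out.println(isValid(s, 28, 8));
    }
}
```

Each character is looked at a constant number of times, so the check is a single pass over the string.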
Sat, 08 Dec 2007, 18:45
Dabz
It seems like you have it working JL235, and my apologies for not having a clue what you said (And I still don't)....

Good luck on the project.. Whatever the hell it is! hehehe

Dabz

Sat, 08 Dec 2007, 19:52
JL235
I can't go into it too much because it's for my assignment. I'd paste the code but I'd rather not have sections strewn across the internet. However, I have also just realised that although for some of the rules I cannot create a regular expression that checks for a valid string, I can create one that finds an invalid string, which is just as useful.
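
To give a flavour of the "find an invalid string" idea without posting the real thing, here is a hypothetical Java sketch applied to the 8-in-28 rule. It is one possible approach and only an assumption about how such a pattern could look: it searches for any 28-character window holding at most 7 a's, using a lookahead plus a backreference to pin the window length. The names are made up.

```java
import java.util.regex.Pattern;

class InvalidWindow {
    // Finds a 28-character window holding at most 7 a's.
    //   (?=.{28}(.*)\z)        a full window exists; group 1 = everything after it
    //   (?:[^a]*a){0,7}[^a]*   at most seven a's, any number of other characters
    //   \1\z                   that run must stop exactly where group 1 begins
    static final Pattern INVALID = Pattern.compile(
        "(?=.{28}(.*)\\z)(?:[^a]*a){0,7}[^a]*\\1\\z", Pattern.DOTALL);

    // True if some window of 28 consecutive characters has fewer than 8 a's.
    static boolean hasInvalidWindow(String s) {
        return INVALID.matcher(s).find();
    }

    public static void main(String[] args) {
        // String.repeat needs Java 11+.
        System.out.println(hasInvalidWindow("a".repeat(30) + "b".repeat(70)));
    }
}
```

A string passes the rule when `hasInvalidWindow` returns false. The pattern relies on backtracking, so it is likely to be slower than a plain counting loop on long inputs.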

When it's complete (and if the other guys in my group agree) I will be uploading it as a showcase. So far I'm pulling my hair out over this. It's such a mess and yet only 1,600 lines.
Sat, 08 Dec 2007, 20:40
power mousey


I'm dazed and confused too.
ack!! 1600 lines.

Good Luck!, JL.
Sun, 09 Dec 2007, 03:24
Dabz

1600 lines.


pffft... It's just a baby! hehehe

Dabz

Sun, 09 Dec 2007, 12:43
power mousey
Um, no.

To check for a valid string within a bigger string
and use 1600 lines of code?
Methinks he should re-code or re-think the design
and algorithm to find a matching pattern and count the specific chars (bits) of the pattern within the bigger pattern or string.

Hopefully, you will understand what I'm saying. I don't want to write another verbose blog about it if you don't follow me or understand what I'm saying.
Sometimes, it's a waste of my time and just not worth the common sense and obviousness of it all.
So... hopefully, you will understand.

That's all folks.