logict wrote: Can you tell me something about the performance of eValid when you use the regular expression feature? On my tests when I do a RegEX search it seems as if the eValid footprint grows a LOT! Anything you can suggest?
To understand the performance issue here you need to appreciate what work the eValid DOM scanning is doing when it is searching the page for a Regular Expression (RegEX) match to an element property/attribute value.
The matching process involves reading the value of each property/attribute of each element on the page and then determining whether the current regular expression is matched within the string that is extracted from that element. In many web pages the total number of property/attribute values to examine is in the range of 1000 to 5000, but in some complex pages this can be 100,000 or more. That is the first factor.
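To get a feel for how quickly the number of values adds up, here is a minimal sketch in Python. It is not how eValid scans the DOM; it uses the standard library HTML parser as a stand-in, and the names AttributeCollector and collect_attribute_values are made up for illustration. It only sees attributes written in the markup, so a real DOM scan would examine more properties than this.

```python
from html.parser import HTMLParser


class AttributeCollector(HTMLParser):
    """Hypothetical helper: gathers every attribute value found in start tags."""

    def __init__(self):
        super().__init__()
        self.values = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes.
        for _name, value in attrs:
            if value is not None:
                self.values.append(value)


def collect_attribute_values(html):
    """Return all attribute values in the markup, in document order."""
    parser = AttributeCollector()
    parser.feed(html)
    parser.close()
    return parser.values


sample = (
    '<div id="a" class="box">'
    '<a href="/x" title="go">x</a>'
    '<img src="p.png" alt="">'
    '</div>'
)
values = collect_attribute_values(sample)
print(len(values), "strings to test from just three elements")
```

Each of those strings is one candidate that the match has to be tried against.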
Now, consider that for each string that is pulled out of the DOM, if you have chosen the regular expression (RegEX) match option, eValid needs to start the match process at the first character of the string and then proceed sequentially through ALL of the characters of the string. Remember, this is not the same as doing a plain substring search, which can skip quickly past any position where the first letter of the target string does not match. For a RegEX, the matcher has to scan forward from each candidate position to see whether the text that follows fits the RegEX criteria. That is the second factor.
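The difference between the two kinds of search can be sketched like this. Again, this is Python's own re module standing in for whatever matcher eValid uses internally, and the function names and sample values are invented for the example.

```python
import re


def substring_matches(values, target):
    """Plain substring test: keep the values that contain target verbatim."""
    return [v for v in values if target in v]


def regex_matches(values, pattern):
    """RegEX test: compile the pattern once, then search every value with it."""
    compiled = re.compile(pattern)
    return [v for v in values if compiled.search(v)]


# Hypothetical attribute values pulled from a page.
attribute_values = ["submit-button", "btn-42", "header", "btn-x"]

print(substring_matches(attribute_values, "btn"))
print(regex_matches(attribute_values, r"btn-\d+"))
```

The RegEX version is more selective, but for every value the matcher has to walk forward through the string checking the pattern rather than just looking for a fixed run of letters.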
Lastly, the running time of a RegEX match grows as the complexity of the RegEX grows, and for some patterns (for example, ones with nested or heavily optional repetition) it can grow exponentially. That's the third factor.
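To illustrate how the cost can blow up, here is a toy backtracking matcher in Python that counts its own steps. It is a sketch under assumptions: it supports only literal characters and the "?" operator, it is not eValid's matcher, and we are assuming a backtracking style of engine. The pattern used is the well-known worst case of n optional a's followed by n required a's, matched against a string of n a's.

```python
def match_here(pattern, text, counter):
    """Toy backtracking matcher: literals and 'x?' only, must match all of text.

    counter is a one-element list used to count how many calls are made.
    """
    counter[0] += 1
    if not pattern:
        return not text
    if len(pattern) >= 2 and pattern[1] == "?":
        # First try consuming the optional character, then back off and skip it.
        if text and text[0] == pattern[0]:
            if match_here(pattern[2:], text[1:], counter):
                return True
        return match_here(pattern[2:], text, counter)
    if text and text[0] == pattern[0]:
        return match_here(pattern[1:], text[1:], counter)
    return False


def count_steps(n):
    """Match 'a?' * n + 'a' * n against 'a' * n and report (matched, steps)."""
    counter = [0]
    matched = match_here("a?" * n + "a" * n, "a" * n, counter)
    return matched, counter[0]


for n in (4, 8, 12):
    matched, steps = count_steps(n)
    print(n, matched, steps)
```

Every one of these matches succeeds, but the matcher has to try every combination of "take it" and "skip it" for the optional characters first, so the step count more than doubles each time n goes up by one.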
When you put all three factors together you can appreciate that a complex RegEX applied to a page with a lot of elements, each with long or complex property/attribute values, can amount to a lot of work. And the work does not simply add up: the three factors multiply each other, and the RegEX complexity factor can itself grow exponentially.
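As a rough back-of-the-envelope sketch of how the factors multiply (in Python; the function name, the average string length of 40 characters, and the per-character cost figures are assumptions picked for illustration, not measurements of eValid):

```python
def estimate_work(num_values, avg_length, steps_per_char):
    """Rough work estimate: the three factors multiply rather than add."""
    return num_values * avg_length * steps_per_char


# Page sizes taken from the ranges mentioned above; the rest is assumed.
typical_page = estimate_work(num_values=5_000, avg_length=40, steps_per_char=1)
complex_page = estimate_work(num_values=100_000, avg_length=40, steps_per_char=1)
complex_page_costly_regex = estimate_work(
    num_values=100_000, avg_length=40, steps_per_char=50
)

print(typical_page, complex_page, complex_page_costly_regex)
```

Going from a typical page to a complex one multiplies the work by 20 in this sketch, and a RegEX that costs 50 times more per character multiplies it by 50 again on top of that.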
Hope this is helpful.
--The eValid Team