
You will not be able to reconstruct the regex with a timing attack unless you make some assumptions about the input, such as a maximum length, and even then reconstructing the regex will be tough. Without a maximum-length assumption, the best you can do is create a string that passes it, because you will never be able to tell the difference between /a+/ and /a{1,10^99999999999999}/. Practically this might not make a difference, but theoretically it does.
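
To make the indistinguishability point concrete, here is a minimal sketch using Python's `re` module. The bound of 50 and the helper names `agree_up_to` and `first_disagreement` are made up for illustration. It only compares accept/reject behaviour, not timing: the two patterns agree on every run of a's up to the bound and first differ one character past it.

```python
import re

# Hypothetical bound, kept small so the check runs quickly.
BOUND = 50

unbounded = re.compile(r"a+")
bounded = re.compile(r"a{1,%d}" % BOUND)


def accepts(pattern, text):
    """True if the whole string matches the compiled pattern."""
    return pattern.fullmatch(text) is not None


def agree_up_to(max_len):
    """True if both patterns give the same verdict on 'a' * n for n in 0..max_len."""
    return all(
        accepts(unbounded, "a" * n) == accepts(bounded, "a" * n)
        for n in range(max_len + 1)
    )


def first_disagreement(max_len):
    """Smallest n in 0..max_len where the verdicts differ, or None if they never do."""
    for n in range(max_len + 1):
        if accepts(unbounded, "a" * n) != accepts(bounded, "a" * n):
            return n
    return None
```

With inputs capped at `BOUND` characters the two patterns look identical from the outside; the difference only appears once you are allowed to send `BOUND + 1` characters.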


Actually, while that was my first thought as well, it depends on the underlying implementation. I'm not positive here, but I think the characteristics of the regex engine could let you recognize the difference between /a+/ and /a{1,1000}/. That said, I haven't tried anything along these lines yet, so we'll see if my idea remotely pans out. It will certainly require knowing which regex engine you're attacking, unlike just generating data.
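
A rough sketch of how one might start probing this, assuming CPython's `re` engine and local access to the matcher (a real attack would be measuring over a network, with far more noise). The helper names `median_match_time` and `compare` are hypothetical, and the sketch does not establish that any difference is actually measurable; it only sets up the measurement.

```python
import re
import time


def median_match_time(pattern, text, repeats=200):
    """Median wall-clock time in seconds for one fullmatch of pattern against text."""
    compiled = re.compile(pattern)
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        compiled.fullmatch(text)
        samples.append(time.perf_counter() - start)
    samples.sort()
    # The median is less sensitive to scheduler noise than the mean.
    return samples[len(samples) // 2]


def compare(text, patterns=(r"a+", r"a{1,1000}")):
    """Map each pattern to its median match time on the same input."""
    return {p: median_match_time(p, text) for p in patterns}
```

Calling something like `compare("a" * 500)` and `compare("a" * 2000)` gives one pair of timings below the bound and one above it. Whether the numbers separate cleanly will depend on the engine and on how much noise is in the measurement.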


You're probably right. I think the idea is really cool, and I'm surprised some CS grad student hasn't jumped on it yet. There is a lot of theory lurking in the background here, and it would definitely make a nice master's thesis.



