
I'm using four features of the article: title, sentence length, sentence position, and modified keyword frequency. The first three are standard features that you can see in most automatic summarization research. Modified keyword frequency considers not just the frequency but also the distance of each keyword. :) There. That's just a brief explanation of how TextTeaser works.
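For readers who want something concrete, here is a rough Python sketch of how four features like these could be combined into a sentence score. It is not TextTeaser's actual code: the equal weighting, the stopword list, the `IDEAL_LENGTH` constant, the "earlier is better" position rule, and the way keyword distance is folded in (`keyword_score`) are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed values for illustration only; not taken from TextTeaser.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "it",
             "that", "for", "on", "with", "as", "by"}
IDEAL_LENGTH = 20  # hypothetical "ideal" sentence length, in words


def words(text):
    """Lowercase, tokenize, and drop stopwords."""
    return [w for w in re.findall(r"[a-z0-9']+", text.lower())
            if w not in STOPWORDS]


def title_score(sentence_words, title_words):
    """Fraction of distinct title words that appear in the sentence."""
    if not title_words:
        return 0.0
    return len(set(sentence_words) & set(title_words)) / len(set(title_words))


def length_score(n_words):
    """1.0 at the ideal length, falling linearly to 0.0."""
    return max(0.0, 1.0 - abs(IDEAL_LENGTH - n_words) / IDEAL_LENGTH)


def position_score(index, total):
    """Assumption: earlier sentences are more important."""
    return 1.0 - index / total


def keyword_score(sentence_words, keywords):
    """Mix keyword frequency with how close together the keywords sit."""
    positions = [i for i, w in enumerate(sentence_words) if w in keywords]
    if not positions:
        return 0.0
    freq = sum(keywords[sentence_words[i]] for i in positions) / len(sentence_words)
    if len(positions) == 1:
        density = 0.0
    else:
        gaps = [b - a for a, b in zip(positions, positions[1:])]
        density = len(gaps) / sum(gaps)  # 1.0 when keywords are adjacent
    return (freq + density) / 2


def summarize(title, text, n=2):
    """Return the n highest-scoring sentences, in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    all_words = words(text)
    total = len(all_words) or 1
    # Top 10 words by count, weighted by relative frequency in the article.
    keywords = {w: c / total for w, c in Counter(all_words).most_common(10)}
    title_words = words(title)
    scored = []
    for i, sentence in enumerate(sentences):
        sw = words(sentence)
        score = (title_score(sw, title_words)
                 + length_score(len(sw))
                 + position_score(i, len(sentences))
                 + keyword_score(sw, keywords)) / 4
        scored.append((score, i, sentence))
    top = sorted(scored, key=lambda t: t[0], reverse=True)[:n]
    return [s for _, _, s in sorted(top, key=lambda t: t[1])]
```

Calling `summarize(title, text, n=2)` picks sentences straight out of the text rather than generating new ones. A real system would presumably tune the feature weights instead of averaging them equally.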


You mention in the article that you developed the algorithm as a part of writing your MSc thesis. Is your thesis available on the Internet (as in pdf or in any other format)?


Will ask my adviser about it and upload the pdf later. :)


FWIW I'd love to see this too. We have a semi-regular paper-reading club at GoCardless (YC S11) and this could be super interesting.


What about a teaser of it in the meantime? ;)


I was reading about various topics related to summarization recently, and I'm just wondering if your "modified keyword frequency" is some type of ConcGram.

http://www.lexically.net/downloads/version5/HTML/definition_...


So where can we read the technical details of the algorithm or look at the source, if available?


I didn't get whether that is extraction or abstraction...



