
I require the title because I need it for the algorithm.


That explains the many tests I just ran with random copy-pasted articles. I just typed gibberish into the title. I mean, not all texts need to have a title.


Maybe in the future I can improve it so that it doesn't require the title. It may produce good results for other types of text, but right now TextTeaser is meant to be used for news articles.


Since the title (headline) of a news article usually summarizes it, TextTeaser arguably is less an article summarizer than a headline expander...

EDIT: it would be nice, from a UX POV, to request the title if it's missing rather than silently deleting the story. Also, you might emphasize its importance (because it doesn't seem important at all). Perhaps just labeling it "headline" or "subject" instead of the generic "title" would help.
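
To make the "headline expander" point concrete, here is a rough sketch of what scoring sentences by title overlap could look like. This is only my guess at how a title-dependent summarizer might work, not TextTeaser's actual code; the names tokenize, title_score, and summarize and the tiny stopword list are made up for illustration.

```python
import re

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"a", "an", "the", "of", "in", "on", "to", "and", "for", "is", "are"}


def tokenize(text):
    """Return lowercase word tokens with the stopwords above removed."""
    return [w for w in re.findall(r"[a-z0-9']+", text.lower()) if w not in STOPWORDS]


def title_score(title, sentence):
    """Fraction of the title's words that also appear in the sentence."""
    title_words = set(tokenize(title))
    if not title_words:
        return 0.0  # an empty title gives every sentence the same score
    return len(title_words & set(tokenize(sentence))) / len(title_words)


def summarize(title, text, n=2):
    """Pick the n sentences sharing the most words with the title, in original order."""
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: title_score(title, sentences[i]),
        reverse=True,
    )
    return [sentences[i] for i in sorted(ranked[:n])]
```

With a scheme like this, a gibberish title scores every sentence at zero and the "summary" degenerates to the first n sentences, which would fit the point about the title mattering more than the form suggests.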


Congratulations on open-sourcing the library. Do you think it could generate a title as a one-sentence summary?


Having just tried it on some of my pages, it looks like the algorithm is giving weight to h1 and h2 tags in the page markup. Is that true or am I imagining it?

If so, I'll have to provide more literal subheadings!


Nope. :) You are imagining it.


Fair enough. I must have used more relevant subheadings than I thought!



