

Also https://minifeed.net/, which I maintain; it will soon reach 1,000 indexed personal blogs.


As someone who runs a very simple crawler, I hope these actions will not affect me too much. I want to be able to collect data and to share it.

Results of my crawling:

https://github.com/rumca-js/Internet-Places-Database


To be honest, I feel that Web 2.0 is overrated.

Most content, blogs included, could be static sites.

For Mastodon and forums, I think user validation is fine and a good way to go.


This could not be further from the truth. The ad business is not going anywhere; it will grow even bigger.

OpenAI is going through the initial cycle of enshittification. Google is too big right now. Once they establish dominance, you will have to sit through five unskippable ads between prompts, even on a paid plan.

I solved the problem for myself. Most of my web projects use client-side processing. I moved to GitHub Pages, so clients can use my projects with no downtime. The pages use SQLite as the data source: the browser first downloads the SQLite file, then uses it to display data client-side.

Example 'search' project: https://rumca-js.github.io/search
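The flow above can be sketched roughly as follows. The real site does this with sql.js in the browser; here Python's stdlib sqlite3 module stands in, and the table and file names are invented for illustration:

```python
import os
import sqlite3
import tempfile

# "Build" step: bake the data into a SQLite file served as a static asset.
path = os.path.join(tempfile.mkdtemp(), "places.sqlite")
con = sqlite3.connect(path)
con.execute("CREATE TABLE entries(title TEXT, url TEXT)")
con.executemany("INSERT INTO entries VALUES (?, ?)",
                [("Example", "https://example.com"),
                 ("HN", "https://news.ycombinator.com")])
con.commit()
con.close()

# "Client" step: the downloaded file is queried locally; no backend needed.
client = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
rows = client.execute("SELECT title FROM entries WHERE url LIKE ?",
                      ("%ycombinator%",)).fetchall()
print(rows)  # [('HN',)]
client.close()
```

Once the file is published as a static asset, every query runs on the client, so the host only ever serves bytes.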


The stated problem was about indexing, accessing content and advertising in that context.

> I solved the problem for myself. Most of my web projects use client-side processing. I moved to GitHub Pages, so clients can use my projects with no downtime. The pages use SQLite as the data source: the browser first downloads the SQLite file, then uses it to display data client-side.

> Example 'search' project: https://rumca-js.github.io/search

That is not really a solution. Since typical indexing still works for the masses, your approach is currently unique. But in the end, bots will be able to read a web page's content whenever a human can. And then we are back to the original problem of trying to tell bots apart from humans. It's the only way.


To be honest, I think it does not matter.

I wish they would split Search off from Google. Either you are in the search business or in the ad business.


To be honest, the browser is not that important to me. It collects a lot of data about you, but I think the search engine matters more for society: it is the lens through which we see the world.

I have already seen many folks switch to using several engines, because you see more that way. Personally, I like SearXNG. There is also GPT, obviously.

Sometimes I also search the domains I crawled:

https://github.com/rumca-js/Internet-Places-Database


I think the majority prefer "algorithms" to serve them information, and that is a killer for any other approach to media consumption, RSS included.

People do not want to 'hunt' for information; they prefer to be served. This opens the door to TikTok, Facebook, and the rest, and to feed manipulation, where big tech decides what you see and what you do not.

That is why I have little regard for what the "majority" uses, as it is commonly the lowest common denominator. Such "algorithms" are often abusive; they take advantage of you, your lizard brain, and your attention.

I am sure there is no golden solution, no ideal. That said, I mostly use RSS, but sometimes I also check the reddit "algorithm" to see what is happening outside my bubble.

This gives me the opportunity to taste both worlds and keep some sanity and middle ground.

I also prefer small blogs to, say, Facebook posts; the latter easily turn into quarrels and shouting, while small blogs offer insight without the social media distractions.


> I think the majority prefer "algorithms" to serve them information

This possibility was the theme of the (most?) recent video essay on the Technology Connections YouTube channel:

https://www.youtube.com/watch?v=QEJpZjg8GuA

There's definitely something to the appeal of algorithmic feeds. I suspect there's a fundamental neurological mechanism* at play that encourages us to scan somewhere for novel appearances and developments, the same as seeing what's in the garden, or on the market, or who is present at the oasis, etc.

But there's probably more to be said for consciously directing attention.

(*There's a certain d-word that probably has a genuine neurochemical meaning when not laden with the popular baggage it has picked up in colloquial use. I'm intentionally not using it because at this point I suspect it encourages misunderstanding more than it helps, but I'm willing to believe it's related.)


relevant post: https://nothinghuman.substack.com/p/the-tyranny-of-the-margi...

> Simply put, companies building apps have strong incentives to gain more users, even users that derive very little value from the app. Sometimes this is because you can monetize low value users by selling them ads. Often, it’s because your business relies on network effects and even low value users can help you build a moat. So the north star metric for designers and engineers is typically something like Daily Active Users, or DAUs for short: the number of users who log into your app in a 24 hour period.

> What’s wrong with such a metric? A product that many users want to use is a good product, right? Sort of. Since most software products charge a flat per-user fee (often zero, because ads), and economic incentives operate on the margin, a company with a billion-user product doesn’t actually care about its billion existing users. It cares about the marginal user - the billion-plus-first user - and it focuses all its energy on making sure that marginal user doesn’t stop using the app. Yes, if you neglect the existing users’ experience for long enough they will leave, but in practice apps are sticky and by the time your loyal users leave everyone on the team will have long been promoted.

> So in practice, the design of popular apps caters almost entirely to the marginal user.


Is the word “degenerate”?


Dopamine.


I concur. The book "Broken Code" by Jeff Horwitz of the Wall Street Journal amply documents how business-metric-driven algorithms steer user attention at the company's convenience, which may or may not agree with the user's. For one data point: I prefer to be a gourmet than to be fed.


Might be a little off topic. I created a web page with data. I didn't want to host a VPS and be charged for traffic, and I did not want to fiddle with Cloudflare and self-hosting either.

My solution? The app is a web page that reads SQLite. When a user opens the app, the database is downloaded, unpacked, and used on the user's device.

Links:

- https://github.com/rumca-js/Internet-Places-Database - search.html provides a preview of my database file (the code also supports reading a zip file).

- https://rumca-js.github.io/search?file=top&page=1&search=neo... - uses JSON files stored in a zip file. Will soon be replaced with zip + SQLite.

- https://rumca-js.github.io/search?file=music&view_display_ty... - example showing my favourite music. As above, uses JSON files in a zip file.
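The "zip + sqlite" packaging can be sketched as below. The live site does this in JavaScript in the browser; this Python stand-in just shows the shape of the flow, with invented file and table names:

```python
import os
import sqlite3
import tempfile
import zipfile

# "Build" step: create the SQLite database that will be shipped.
work = tempfile.mkdtemp()
db_path = os.path.join(work, "music.sqlite")
con = sqlite3.connect(db_path)
con.execute("CREATE TABLE tracks(artist TEXT, title TEXT)")
con.execute("INSERT INTO tracks VALUES ('Artist', 'Song')")
con.commit()
con.close()

# "Publish" step: compress the database into the archive that gets hosted,
# shrinking the download.
zip_path = os.path.join(work, "music.zip")
with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as z:
    z.write(db_path, "music.sqlite")

# "Client" step: unpack the downloaded archive and open the database.
unpacked = os.path.join(work, "client")
with zipfile.ZipFile(zip_path) as z:
    z.extract("music.sqlite", unpacked)
client = sqlite3.connect(os.path.join(unpacked, "music.sqlite"))
rows = client.execute("SELECT title FROM tracks").fetchall()
print(rows)  # [('Song',)]
client.close()
```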


This is nice. I like the idea, which has been tried in a few places, of running SQLite in the browser directly/locally. The only thing really missing to make this work at a bigger scale for read-heavy databases is a very cheap or free static hosting service that supports range requests, gives you control of CORS, and doesn't have the file-size limitations of Gist or GitHub Pages. Maybe this exists already? S3 would do, I guess?

You can do kind of magic things like this and build websites that connect to multiple different databases around the web and... well, I'll leave the rest up to your imagination.
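The range-request idea rests on SQLite's file layout: the first 100 bytes of any database file are a fixed-format header, so a client that can fetch byte ranges (as projects like sql.js-httpvfs do) can read the page size up front and then pull only the pages a query touches. A minimal sketch, with a local file read standing in for a ranged HTTP fetch:

```python
import os
import sqlite3
import struct
import tempfile

# Create a small database file to inspect.
path = os.path.join(tempfile.mkdtemp(), "demo.sqlite")
con = sqlite3.connect(path)
con.execute("CREATE TABLE t(x)")
con.commit()
con.close()

# Read only the 100-byte header, a stand-in for `Range: bytes=0-99`.
with open(path, "rb") as f:
    header = f.read(100)

magic = header[:16]                                # b'SQLite format 3\x00'
page_size = struct.unpack(">H", header[16:18])[0]  # big-endian; 1 means 65536
print(magic, page_size)
```

Knowing the page size, the client can translate "read page N" into a byte range, which is why a static host with range-request support is enough.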

Go here: https://just.billywhizz.io/sqlite/squeel/

Hit CTRL/CMD + Q on your keyboard.

Paste in this SQL:

```
attach database 'https://raw.githubusercontent.com/just-js/just.billywhizz.io...' as chinook;

select * from albums;
```

and hit CTRL/CMD + G to run the queries.


I mean, if you only have a few thousand records, you barely need a database at all.


> I mean if you only have a few thousand records you barely need a database at all.

Keyword being "barely".

There are organizational benefits if you can structure your data into a DB, instead of having each page redundantly hold the same header and metadata info.


Previously I was using JSON. However, there are multiple structures with relations between them, so... this really is a database.

Extracting data from it also becomes really easy with selects. Otherwise I would have to implement, or reuse, algorithms to filter the JSON data, and so on.
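The contrast can be sketched as follows; the schema (domains that own entries) is invented to mirror the kind of related structures described:

```python
import sqlite3

data = {
    "domains": [{"id": 1, "name": "example.com"}],
    "entries": [{"domain_id": 1, "title": "Post A"},
                {"domain_id": 1, "title": "Post B"}],
}

# JSON approach: build lookup structures and filter by hand.
by_domain = {d["id"]: d["name"] for d in data["domains"]}
titles_json = [e["title"] for e in data["entries"]
               if by_domain.get(e["domain_id"]) == "example.com"]

# SQLite approach: load once, then every question is just a SELECT.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE domains(id INTEGER PRIMARY KEY, name TEXT)")
con.execute("CREATE TABLE entries(domain_id INTEGER, title TEXT)")
con.executemany("INSERT INTO domains VALUES (?, ?)",
                [(d["id"], d["name"]) for d in data["domains"]])
con.executemany("INSERT INTO entries VALUES (?, ?)",
                [(e["domain_id"], e["title"]) for e in data["entries"]])
titles_sql = [r[0] for r in con.execute(
    "SELECT e.title FROM entries e JOIN domains d ON d.id = e.domain_id "
    "WHERE d.name = ? ORDER BY e.title", ("example.com",))]
print(titles_json, titles_sql)  # both ['Post A', 'Post B']
```

Each new question against the JSON needs new traversal code; against SQLite it is one more SELECT.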


read this in firefox/waterfox


This is not your computer. This is not your operating system. This is not your browser.

