
https://github.com/tailwindlabs/tailwindcss.com/commits/main...

They've just added 26 sponsor companies in the last two days, 7 of them partners!



I’ve implemented OTEL for background jobs: async jobs get picked up from the DB, and I store the trace context in the DB so it can be passed along to multiple async jobs. Some jobs fail and retry with a backoff strategy, so they can take many hours, and we can see the traces fine in Grafana. Each job creates its own span, but they are all within the same trace.

It works well for us; I’m not sure I understand the issue you’re facing?
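
For anyone curious, here’s a minimal sketch of that pattern in Python with OpenTelemetry (the `db` object, its `insert` method, and `run_job` are hypothetical placeholders, not a real API):

    from opentelemetry import propagate, trace

    tracer = trace.get_tracer(__name__)

    def enqueue_job(db, payload):
        # Serialize the current trace context (W3C traceparent) into the job row
        carrier = {}
        propagate.inject(carrier)
        db.insert("jobs", {"payload": payload, "trace_context": carrier})

    def process_job(db, job):
        # Restore the context so this job's span joins the original trace,
        # even if the job runs hours later or is retried with backoff
        ctx = propagate.extract(job["trace_context"])
        with tracer.start_as_current_span("process-job", context=ctx):
            run_job(job["payload"])  # hypothetical: whatever actually does the work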


OK, after re-reading I think your issue is with long-running spans; you should break your spans down into smaller chunks. But a trace can take many hours or days and be analysed even when it’s not finished.


French native here. Libre means free in the sense of freedom, not in the sense of free as in free beer (the term for that is “gratuit”)


Yeah, that’s what I was saying. Spanish native myself :D


Did you consider also caching the coordinates returned by Moondream? I understand that it is cheap, but it could be useful to detect if an element has changed position, as that may be a regression.


So the problem is that if we cache the coordinates and click blindly at the saved positions, there's no way to tell if the interface has changed or if we are actually clicking the wrong things (unless we try something hacky like listening for events on the DOM). Detecting whether elements have changed position would definitely be feasible, though: when re-running a test with Moondream, we could compare against the coordinates of the last run.
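
As a rough illustration of that comparison (hypothetical names and data shapes, nothing from the actual product), something like:

    def find_moved_elements(previous, current, tolerance=5):
        """Compare coordinates cached from the last run against a fresh
        Moondream run. previous/current map element label -> (x, y).
        Returns the labels that moved more than `tolerance` pixels."""
        moved = []
        for label, (px, py) in previous.items():
            if label not in current:
                moved.append((label, "missing"))
                continue
            cx, cy = current[label]
            if abs(cx - px) > tolerance or abs(cy - py) > tolerance:
                moved.append((label, (px, py), (cx, cy)))
        return moved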


sounds a lot like snapshot testing


Not saying that you are, but reading this as if an AI bot wrote that comment gives me the chills.


Do you know when we can expect an update on the Realtime API? It’s still in beta and there are many issues (e.g. voice randomly cutting off, VAD issues, especially with mulaw, etc.) which make it impossible to use in production, but there’s not much communication from OpenAI. It’s difficult to know what to bet on. Pushing for stt->llm->tts makes you wonder whether we should carry on building with the Realtime API.


we're working hard on it at the moment and hope we'll have a snapshot ready in the next month or so

we've debugged the cutoff issues and have fixes for them internally but we need a snapshot that's better across the board, not just cutoffs (working on it!)

we're all in on S2S models both for API and ChatGPT, so there will be lots more coming to Realtime this year

For today: the new noise cancellation and semantic voice activity detector are available in Realtime. And ofc you can use gpt-4o-transcribe for user transcripts there
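
If it helps anyone wiring this up, this is roughly how those options are enabled via a session.update event; field names are as I understand them from the beta docs, so double-check the current reference, and `ws` is assumed to be an already-open Realtime WebSocket connection:

    import json

    async def configure_realtime_session(ws) -> None:
        """Enable semantic VAD, input noise reduction, and gpt-4o-transcribe
        user transcripts on an already-open Realtime WebSocket connection."""
        await ws.send(json.dumps({
            "type": "session.update",
            "session": {
                "turn_detection": {"type": "semantic_vad"},
                "input_audio_noise_reduction": {"type": "near_field"},
                "input_audio_transcription": {"model": "gpt-4o-transcribe"},
            },
        }))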


Agreed: really not liking how they are neglecting it… I hope they are just hard at work behind the scenes and will release something soon.


S2S is where we're investing the most effort on audio ... sorry it's been slow but we are working hard on it

Top priorities at the moment: 1) better function calling performance, 2) improved perception accuracy (not mishearing), 3) more reliable instruction following, 4) bug fixes (cutoffs, run-ons, modality steering).


Appreciate the efforts. It’s not there yet, but when it gets there it will open up a lot of use cases.

Any fine-tuning for S2S on the horizon?


I’m not entirely sure what you mean, but Twilio recordings already support dual channels.


Transcribing Twilio's dual-channel recordings using OpenAI's speech-to-text while preserving channel identification.


Oh, I see what you mean; that would be a neat feature. Assuming you can get timestamps, though, it should be trivial to work around the issue?


There are two options that I know of:

1. Merge both channels into one (this is what Whisper does with dual-channel recordings), then map transcription timestamps back to the original channels. This works only when speakers don't talk over each other, which is often not the case.

2. Transcribe each channel separately, then merge the transcripts. This preserves perfect channel identification but removes valuable conversational context that helps the model's accuracy (e.g., Speaker A asks a question without which Speaker B's answer is hard to follow).

So yes, there are two technically trivial solutions, but you either get somewhat inaccurate channel identification or degraded transcription quality. A better solution would be a model trained to accept an additional token indicating the channel ID, preserving it in the output while benefiting from the context of both channels.


(2) is also significantly harder with these new models, as they don’t support word timestamps like Whisper does.

See the docs:

> Other parameters, such as timestamp_granularities, require verbose_json output and are therefore only available when using whisper-1.
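
For reference, a minimal sketch of option (2) using whisper-1 (which, per the quoted docs, is the only model exposing word timestamps via verbose_json): split the stereo recording into its channels, transcribe each one, then interleave the words by start time. File names here are placeholders:

    from openai import OpenAI
    from pydub import AudioSegment  # pip install pydub (requires ffmpeg)

    client = OpenAI()

    def transcribe_channel(path, channel):
        with open(path, "rb") as f:
            resp = client.audio.transcriptions.create(
                model="whisper-1",
                file=f,
                response_format="verbose_json",
                timestamp_granularities=["word"],
            )
        return [(w.start, channel, w.word) for w in resp.words]

    # Split Twilio's dual-channel (stereo) recording into one mono file per speaker
    left, right = AudioSegment.from_file("call.wav").split_to_mono()
    left.export("caller.wav", format="wav")
    right.export("agent.wav", format="wav")

    # Transcribe each channel separately, then interleave words by start time
    words = transcribe_channel("caller.wav", "caller") + transcribe_channel("agent.wav", "agent")
    for start, channel, word in sorted(words):
        print(f"{start:7.2f}s [{channel}] {word}")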


Probably referring to startup time. Larger apps solve this with the “reloaded” type of workflow (https://www.cognitect.com/blog/2013/06/04/clojure-workflow-r...)


I would often leave REPLs running for days, so it never seemed to matter.


Call it “stalled” if you like; it’s stable and it’s pure joy. I can just get stuff done with Clojure. And things that may seem inactive, like that lib you need that hasn’t had a commit in 8 years, turn out to just work and not need to change. This is commonplace in Clojure.

Spec is still alpha, and I’m not sure whether it will evolve further or be replaced by something completely different. At least they’re not pushing you down the wrong path. Use/look at Malli instead of spec.


There are things in the core language that are still kind of bullshit. The one that gives me the most headaches is how annoying it is to use Java libraries that rely on the lambda syntax.

You can't just pass a Clojure `fn` where a Java lambda is expected. You have to `reify` the interface and implement its single method. It's annoying and verbose, and frustratingly the equivalent code ends up considerably cleaner in Java as a result.

I know that this is a product of how Java implemented lambdas by having interfaces with a single method, and I'm not saying that it would be trivial to add into Clojure, but I don't think it's impossible and I think people have been complaining about this for more than a decade now.

So while I love Clojure (it's probably my favorite language), I do get a little annoyed when people act like it's "stable" because there's nothing left to fix.


I think that particular issue was addressed in the recent Clojure release [https://clojure.org/news/2024/09/05/clojure-1-12-0] ("Clojure developers can now invoke Java methods taking functional interfaces by passing functions with matching arity.")


Looks like you're right! I stand corrected, it's admittedly been a few months since I've touched Clojure.


You were not wrong though, Java 8 was released in 2014, so it only took them ~10 years ;)


Yeah, most of my professional Clojure experience was from 2018 to 2021, with a few personal projects in 2022 and 2023, and that lack of support for functional interfaces really annoyed me. I was writing a Kafka Streams application, which uses lots of lambdas, and it annoyed me enough to rewrite it in vanilla Java.

I'm sure it's a difficult thing to implement, so I'm a little forgiving, but considering that Java interop is one of the biggest selling points for Clojure I do think it's fair to criticize a bad experience with it.

I need to play with the newer stuff though; the linked changes seem cool as hell.


> At least they’re not pushing you down the wrong path

No, but the release of spec single-handedly killed Schema, a more mature contract lib.

Then spec stagnated, because the core team didn't want to work on it; but since it's a core-team library, nobody else could work on it either.

Thus was born Malli, because people got tired of waiting.

