[-] nulldev@lemmy.vepta.org 0 points 1 year ago* (last edited 1 year ago)

There's only an incentive to do that if the mempool is empty. If the mempool is full, there will be plenty of transactions for both the first miner and the next miner.

Wait... This entire paper only makes sense if the mempool is near empty. If the mempool is full, then there is no reason to mine an empty/partial block because there will always be transactions left for future miners.

So basically:

  • Mempool full = miner would mine full blocks just like intended.
  • Mempool empty = miner would mine empty blocks but that isn't a problem because there are no transactions to process in the mempool.
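
A rough sketch of the two cases, assuming a toy model where the mempool is just a list of fees and every block has a fixed capacity. `fill_blocks` and its parameters are made up for illustration; real miners select by fee rate and block weight.

```python
def fill_blocks(mempool_fees, capacity, num_blocks):
    """Toy model: each successive miner greedily takes the highest-fee transactions left."""
    remaining = sorted(mempool_fees, reverse=True)
    blocks = []
    for _ in range(num_blocks):
        # The current miner fills the block up to capacity ...
        blocks.append(remaining[:capacity])
        # ... and whatever is left stays in the mempool for the next miner.
        remaining = remaining[capacity:]
    return blocks


# Full mempool: the first miner fills a block and the next miner still has fees to collect.
full_case = fill_blocks([5, 4, 3, 2, 1, 1], capacity=3, num_blocks=2)

# Empty mempool: both blocks come out empty, but no transactions are left waiting.
empty_case = fill_blocks([], capacity=3, num_blocks=2)
```

In the full case neither miner gains anything by leaving transactions out, which is the point of the first bullet.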
[-] nulldev@lemmy.vepta.org 2 points 1 year ago

I think we generally agree, but I just want to clarify anyway. I'm not saying we should use PNG to store frames from videos.

What I am saying, however, is that we should replace PNG with a modern lossless image format that is flexible enough that users don't have to deal with these issues. All this colorspace stuff should be handled automatically, and I shouldn't have to worry about whether the result is lossless. If I want to save a frame of video, I should be able to do it using an image format that everybody recognizes and accepts. It should not be a huge hassle, and it should be fully lossless.

[-] nulldev@lemmy.vepta.org 1 points 1 year ago* (last edited 1 year ago)

There's no real reason why you shouldn't use PNG for a frame of video. I'm not talking about using it as a video format, I'm talking about extracting a frame from a video and sending it off to an editor for inclusion in another video or image.

As a user, I would expect that I could use the most popular lossless image format if I want to losslessly share a frame from a movie with someone.

Of course I do agree that we need adoption of other image formats. We really should not still be cramming everything in PNGs or JPGs in 2023.

[-] nulldev@lemmy.vepta.org 1 points 1 year ago

I know, it’s my code.

Wow, very nice! First of all, I'll preface this by admitting that I have not worked with LLMs to the degree of making a toy implementation. Your explanation of the sampling techniques is insightful, but it doesn't clear up my confusion. Why does sampling imply the absence of higher-level structure in the model?

For example, even though poker is highly influenced by chance, I can still have a plan that will increase my likelihood of winning. I don't know what card will be drawn next but I can prepare strategies for each possible card. I can have preferences for which cards I want to be drawn next.

[-] nulldev@lemmy.vepta.org 2 points 1 year ago

No, I am describing how they actually work.

First of all, this link is just to C# bindings of llama.cpp, so it doesn't contain the actual implementation. It also doesn't refute my criticism of your claim. More specifically, I take issue with this statement of yours: "today’s LLM’s ingest a ton of text [snip] and builds up statistics of which tokens it sees in that context".

I claim that this is not how today's LLMs work, because we have no idea what LLMs do with the input data during training. We have very little insight into what kind of data structure a model builds or how that data structure is organized.

No, it’s done because one letter at a time is too slow. Tokens are a “happy” medium tradeoff.

I think I worded my sentence ambiguously. Let me reword it for you: "Going one token at a time is only considered a limitation because LLMs are not accurate enough."

It makes a “break” of the block, which lets it start a new answer instead of continuing on the previous. How it reacts to that depends on the fine tune and filters before the data hits the LLM.

Once again, my sentence was not written well. My bad. I was commenting on the observed behavior, not on how it works from a technical perspective.

I have just said that LLM’s we have today can’t fix the problems with false data and hallucinations, because it’s a core principle of how it operates. It will require a new approach.

You could add a rocket engine and wings to a pogo stick, but then it’s no longer a pogo stick but an airplane with a weird landing gear. Today’s LLM’s could give us hints to how to make a better AI, but that would be a different thing than today’s LLM’s. From what has been leaked from OpenAI GPT4 has scaling issues so they use mixture of experts. Just throwing hardware at it is already showing diminishing returns. And we’re learning fascinating new ways of training them, but the inherent problem is the same.

Alright, we agree here for the most part, so I'm just going to skip this.

For example, if you ask an LLM if it can give an answer to a question, it will have two paths to go down, positive and negative. Note, at the point where it chooses that it doesn’t know how to finish it, it doesn’t look ahead.

This is weird though. How do you know LLMs can't look ahead? When we prompt an LLM, we are basically asking it: "What is the next word of your response?" How do you know it hasn't already written out the entire response in memory and then only shown you the first word? LLMs are neural networks, and neural networks have working memory. That's how they work, after all: a vector of data is repeatedly transformed as it passes through each layer. Of course, if it does write out the entire response in memory, all of that is thrown away after every word.
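
To make that concrete, here is a minimal sketch of a token-by-token generation loop. `forward` is a made-up stand-in for a real network, and `VOCAB` and the scores are invented for illustration. It also ignores optimizations such as key/value caching. The point is only that the loop keeps the sampled token and drops whatever the model computed internally on that step.

```python
import random

VOCAB = ["the", "cat", "sat", "down", "<eos>"]


def forward(tokens):
    """Hypothetical stand-in for a network: returns (hidden_state, scores over VOCAB)."""
    # The hidden state could encode anything, including a plan for the rest of the reply.
    hidden = [len(t) for t in tokens]
    # Toy scoring: strongly prefer the vocabulary entry at the current position.
    preferred = min(len(tokens), len(VOCAB) - 1)
    scores = [1.0 if i == preferred else 0.1 for i in range(len(VOCAB))]
    return hidden, scores


def generate(prompt, max_new_tokens, rng):
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        hidden, scores = forward(tokens)
        # Sampling only picks one token from the scores ...
        next_token = rng.choices(VOCAB, weights=scores, k=1)[0]
        tokens.append(next_token)
        # ... and the internal state from this step is not carried over.
        del hidden
        if next_token == "<eos>":
            break
    return tokens


output = generate(["the"], max_new_tokens=4, rng=random.Random(0))
```

Nothing in the loop tells you what `hidden` contained before it was dropped, which is why the sampling step alone can't settle whether the model looked ahead.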


As far as the backspace tokens go, you are right to be skeptical, but don't be surprised if it works out. We've had LLMs trained to complete and edit text for some time already. They've fallen out of use today, but they did perform acceptably well.

[-] nulldev@lemmy.vepta.org 0 points 1 year ago* (last edited 1 year ago)

A significant number of subreddits shifting to private caused some expected stability issues, and we’ve been working on resolving the anticipated issue.

How in the world does setting a bunch of subs to private crash the website?

