this post was submitted on 09 Nov 2025
30 points (100.0% liked)

technology

The article gives no indication of how the model works (other than that it's neural networks), so it could be a fluke, especially since it is not physics-based.

The Ars Technica piece has this at the bottom:

“It’s not immediately clear why the GFS performed so poorly this hurricane season,” Lowry wrote. “Some have speculated the lapse in data collection from DOGE-related government cuts this year could have been a contributing factor, but presumably such a factor would have affected other global physics-based models as well, not just the American GFS.”

top 8 comments
[–] Philosoraptor@hexbear.net 28 points 2 weeks ago* (last edited 2 weeks ago) (2 children)

Effortpost incoming!

This is literally my exact area of specialty (foundations of model building in weather and climate science), so I have a bunch of thoughts. There are two assumptions here: first, that all we want out of our models is predictive accuracy, and second that we should be using only the model that generates the best predictions. Neither is true.

Imagine an alien came down from space and handed you a kind of souped up Magic 8 Ball. The alien 8 Ball works just like a regular one--you ask it a question, flip it over, and it gives you an answer in a little window--except it is really accurate. You can ask it "where will this hurricane be in 3 days?" and it'll tell you, with great accuracy, every time. The same is true for basically any other physical prediction you want to make.

Would this be a good reason to shut down every scientific investigation program in the world? Would the 8 Ball just have "solved" science for us? Pretty clearly not, I think. It might be a really useful tool, but science isn't just a machine that you feed questions into and read predictions off of. Part of why we do science is to explain things and identify general patterns in how the world changes over time. Among other things, that helps us refine the questions we're asking, as well as discover new ones that we hadn't thought to ask.

There'd be a natural question, I think, of why something like the 8 Ball worked the way that it did. We'd want to know what it was about the internal structure of the ball and its relationship to the structure of the world that made it a good model. That's because part of what we use models for is structural investigation, not just point-forecasting. I want to be able to look at the structure of the model and learn something about the structure of the target system I'm studying--in fact, a lot of physical insights into the structure and function of the global climate have come about this way rather than from (as I think most people would expect) studying the climate system directly. Having a physics-based model facilitates that, because we can see how the structures and outputs of the model fit in with other patterns in the physical world with which we are more familiar. In addition, understanding the way that the model behavior emerges from low-level physical principles helps us have confidence in the stable projectability of the model's predictions. We can be a lot more confident, that is, that our model will remain reliable even across very different conditions because the physical principles are stable.

I don't have the same kind of confidence in a machine learning model because it is merely a pattern recognition engine--it's an 8 Ball. In fact, this is exactly the methodology we used for weather forecasting before meteorology emerged as a mature model-based science. The most popular method for forecasting the weather during the first part of the 20th century involved the use of purely qualitative maps of past weather activity. Forecasters would chart the current state to the best of their ability, noting the location of clouds, the magnitude and direction of prevailing winds, the presence of precipitation, etc. Once the current state was recorded on a map of the region of interest, the forecasters would refer back to past charts of the same region until they found one that closely resembled the chart they had just generated. They would then check to see how that past state had evolved over time, and would base their forecast of the current situation on that past record.

This turned forecasting into the kind of activity that took years (or even decades) to become proficient in; in order to make practical use of this kind of approach, would-be forecasters had to have an encyclopedic knowledge of past charts, as well as the ability to make educated guesses at how the current system might diverge from the most similar past cases. This approach faded away as people got a better theoretical understanding of atmospheric physics and the other relevant processes underwriting the weather and climate systems. Eventually, it was replaced entirely by computational modeling that's grounded in stuff like fluid dynamics, thermodynamics, and other relatively well-understood physical processes.
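(For the curious: in modern terms the charting method is basically analog forecasting, i.e. nearest-neighbor matching against an archive of past states. Here's a toy sketch of the idea--the arrays are random stand-ins for digitized charts, not real data:)

```python
import numpy as np

# Toy "analog" forecasting: find the past states most similar to today's
# chart and use how those states evolved as the forecast.
rng = np.random.default_rng(0)
archive_states = rng.normal(size=(10_000, 50))  # stand-in for digitized past charts
archive_next = archive_states + rng.normal(scale=0.1, size=(10_000, 50))  # what each chart looked like 24h later

def analog_forecast(current_state, states, next_states, k=5):
    """Average the 24h evolutions of the k most similar past states."""
    dists = np.linalg.norm(states - current_state, axis=1)
    nearest = np.argsort(dists)[:k]
    return next_states[nearest].mean(axis=0)

today = rng.normal(size=50)
tomorrow_guess = analog_forecast(today, archive_states, archive_next)
```

The ML forecasters are obviously vastly more sophisticated than a nearest-neighbor lookup, but the epistemic situation is the same: pattern matching against the past, with no physics underneath.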

It seems natural (and correct) to say that something was added to meteorology when people started making predictions based on things like atmospheric circulation models and other well-articulated theories grounded in formal models, rather than just looking at past weather maps. It also seems like whatever it was that was added in that transition is at least partially independent of predictive success: even if the weather map method was about as good at predicting tomorrow's weather as the computational modeling method, the latter seems more like a mature science in virtue of explaining why tomorrow's weather prediction is what it is. In both this case and the 8-Ball case, the thing that seems missing is explanation.

Using machine learning (and only machine learning) is a step back to this way of doing forecasting. It effectively amounts to the old "charting" approach to weather forecasting, just done by a system that is very, very good at it--better than any human. An 8 Ball. It cuts off a whole branch of explanatory (and eventually predictively relevant) investigation, and may well fail catastrophically even at prediction when confronted with conditions it hasn't seen before. It's hugely shortsighted, and represents a fundamental misunderstanding of the role of models in science.

[–] Evilphd666@hexbear.net 6 points 2 weeks ago (1 children)

Effort question, probably more on the theoretical/opinion side.

Do current models incorporate deliberate weather manipulation by humans, such as cloud seeding? Does human-induced weather manipulation have a consequential effect on our models, and an actual butterfly effect IRL on weather systems?

[–] Philosoraptor@hexbear.net 5 points 2 weeks ago (1 children)

We're not actually very good at weather manipulation, either in theory or in practice. Maybe counterintuitively, we have a much better handle on what we'd need to do for climate manipulation (especially via aerosol injection), and there's definitely a robust research program investigating that, though it's relatively new. We started systematically studying geoengineering proposals as part of CMIP6 in 2015 (I was actually part of the inaugural working group!) and it's a pretty significant part of the overall effort now.

Our understanding of weather (as opposed to climate) manipulation is much shakier. The most high-profile attempt to engage in it was probably in China before the 2008 Beijing Summer Olympics, and there's not even a widespread consensus on whether it was successful. They attempted some significant cloud seeding to try to keep it from raining on the games, and it didn't rain, but we're not very confident that it was the cloud seeding that did it. Part of the reason to prefer physically grounded models, though, is that it's relatively easy to incorporate this stuff. The model doesn't care if cloud condensation nuclei are injected or naturally occurring.

Would this have long-term butterfly-type effects? Yeah, definitely, but the weather system is so chaotic that it honestly wouldn't really matter much. There's already a pretty hard time horizon of about two weeks beyond which we might as well just be throwing darts to make predictions, and forecasts are really only reliable 5-7 days out. Introducing deliberate weather manipulation wouldn't put us in a significantly worse position, and there are hard-to-overcome mathematical reasons why it's challenging to improve forecasts beyond that timeframe already (at least for weather--obviously climate forecasting is different).
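If you want to see those "hard-to-overcome mathematical reasons" in miniature, the classic toy illustration is the Lorenz '63 system: two trajectories that start closer together than any instrument could distinguish end up completely decorrelated after a short time. A rough sketch (a three-variable toy, obviously not an actual atmospheric model):

```python
import numpy as np

def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One Euler step of the Lorenz '63 system, a toy chaotic 'atmosphere'."""
    x, y, z = state
    deriv = np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])
    return state + dt * deriv

a = np.array([1.0, 1.0, 1.0])
b = a + np.array([1e-9, 0.0, 0.0])  # a perturbation far below measurement error

for step in range(5001):
    if step % 1000 == 0:
        print(step, np.linalg.norm(a - b))  # separation between the two runs
    a, b = lorenz_step(a), lorenz_step(b)

# The separation grows roughly exponentially until it saturates at the size
# of the attractor: past that horizon, the forecast carries essentially no
# information about which trajectory you're actually on.
```

The real atmosphere has the same sensitive dependence on initial conditions, just in billions of dimensions instead of three, which is where that roughly two-week wall comes from.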

[–] Evilphd666@hexbear.net 2 points 2 weeks ago

Thanks for entertaining the question. Makes sense we're smol bean in the larger mix, but cool you were part of that. Thanks for sharing!

[–] BountifulEggnog@hexbear.net 3 points 2 weeks ago (1 children)

This is completely off topic and only related because of your work and my interest in climate change. What do you think of the assumptions mainstream climate science makes in its models compared with other estimates, like the amount of warming caused by CO2 doubling? I've seen James Hansen put forward that it's much higher than mainstream models suggest, closer to 4.5°C. I forget the exact range both put forward.

Also, do models include things like Arctic methane being released, boreal forests burning, etc.? I again don't have numbers in front of me, but those are fairly significant amounts of GHGs, right?

Would love to hear more about your work in general, climate change might be the single most important issue to me.

[–] Philosoraptor@hexbear.net 1 points 2 weeks ago* (last edited 2 weeks ago)

What do you think of the assumptions mainstream climate science makes in its models compared with other estimates, like the amount of warming caused by CO2 doubling? I've seen James Hansen put forward that it's much higher than mainstream models suggest, closer to 4.5°C. I forget the exact range both put forward.

There's a lot of uncertainty surrounding this number (it's called the "climate sensitivity") and a pretty big range of estimates. Hansen's number is on the higher end of the range, though not the highest you can find in the literature. I haven't done a systematic survey of the literature lately, but my anecdotal impression is that the average estimate is shifting higher, especially in the last few years. As you say, a lot of the uncertainty comes from whether (and how much) a particular model includes the presence of various positive feedback mechanisms like permafrost melt-associated methane release or thermohaline shutdown. These sorts of things are, by their very nature, extremely hard to predict with a high degree of certainty, so the best we can do is assign a probability distribution over the relevant values. The overall predictions of the models depend pretty sensitively on the exact shape of those probability distributions, which in turn depend on the value of various other parameters.
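One back-of-the-envelope way to see where the spread comes from: if you write the equilibrium response as ΔT = F_2x / (λ_Planck − f), with f the (uncertain) sum of positive feedbacks, then even a symmetric uncertainty in f produces a skewed, long-tailed distribution for the sensitivity. A toy Monte Carlo sketch--the numbers are roughly textbook values for illustration, not the output of any particular model:

```python
import numpy as np

# Toy Monte Carlo: S = F_2x / (lambda_planck - f), where f is the uncertain
# sum of positive feedbacks. All values are illustrative, not model output.
rng = np.random.default_rng(42)
F_2x = 3.7            # W/m^2, approximate forcing from doubled CO2
lambda_planck = 3.2   # W/m^2/K, no-feedback (Planck) response
f = rng.normal(loc=2.0, scale=0.4, size=100_000)  # assumed feedback distribution
f = f[f < lambda_planck - 0.1]                    # drop unphysical runaway samples

S = F_2x / (lambda_planck - f)
print(f"median ~{np.median(S):.1f} K, "
      f"5-95% ~{np.percentile(S, 5):.1f}-{np.percentile(S, 95):.1f} K")
# A symmetric uncertainty in the feedbacks gives an asymmetric, long-tailed
# spread in sensitivity: small shifts in f matter a lot near the top end.
```

That's why seemingly modest disagreements about feedback strength (Hansen vs. the ensemble mean, say) translate into pretty large differences in headline sensitivity numbers.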

In the biz, we call those kinds of things "highly tunable parameters." They're processes that the model isn't resolving explicitly in the model physics (we're not directly simulating the melting of permafrost, the location of methane deposits, and the associated release of GHGs, for instance) that also have a rather large range of "physically plausible" values. The classic example of a highly tunable parameter is cloud formation. Because of computational limitations, our most detailed models run on grids with squares that are on the order of ~150km to a side. That means that anything that happens at a spatial scale smaller than a grid square is invisible to the model, since it can't explicitly resolve sub-grid dynamics. Most clouds are significantly smaller than that, so we can't really model cloud formation directly (in the sense of just having the physics engine do it), but clouds are (obviously) really important to the state of the climate for a whole bunch of reasons. The way we get around that is by parameterizing cloud formation in terms of stuff that the model can resolve. Basically, this means looking at each grid square and having the model figure out what percentage of the square is likely to be covered by clouds (and at what elevation) based on various values that we know are physically relevant (humidity, temperature, pressure, etc.) in that square. This is imperfect, but it does a pretty good job and lets the model work with stuff that it can't directly simulate.
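For a concrete (and deliberately oversimplified) sense of what a parameterization looks like, here's a toy relative-humidity-based cloud-fraction scheme of the kind gestured at above. The functional form and the rh_crit threshold are made up for illustration; real schemes are far more involved:

```python
import numpy as np

def cloud_fraction(rel_humidity, rh_crit=0.8):
    """Toy diagnostic cloud-fraction parameterization for a grid cell.

    The model can't resolve individual clouds, so we map a resolved
    variable (here, grid-mean relative humidity) to an assumed sub-grid
    cloud cover. rh_crit is the tunable parameter: below it we assume no
    cloud; above it, assumed cover ramps up to 1. Illustrative only.
    """
    excess = np.clip((rel_humidity - rh_crit) / (1.0 - rh_crit), 0.0, 1.0)
    return excess ** 2

# Grid-mean relative humidity at a few model levels in one column:
rh = np.array([0.55, 0.82, 0.95, 0.70])
print(cloud_fraction(rh))   # assumed cloudy fraction of each grid cell
```

The key point is that rh_crit (and the shape of that ramp) isn't dictated by first principles--it's tuned so the model's overall behavior looks right, which is exactly what makes it a "highly tunable parameter."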

Lots of feedback mechanisms are like this also. For one reason or another, many of them are not things that we're simulating directly--sometimes because we don't have a good enough theoretical understanding, sometimes because we don't have the relevant data, and sometimes because (like clouds) they're operating at spatial scales below what the model can "see." But we all know that those things are important, so they're incorporated as parameterizations. The problem is that each abstraction step here introduces another layer of uncertainty: the relevant parameters are often highly tunable so there's uncertainty there, we're not sure exactly how strong the coupling constants are so there's uncertainty there, and we're not sure we have all the relevant processes parameterized. That's a big part of what explains the range of value estimates: depending on your preferred values for all those things, you can get a climate sensitivity as low as 2 or 3 degrees C and as high as 6 or 7.

Part of how we deal with that problem is through the use of ensemble modeling. The big "grand ensemble" project I mentioned (CMIP, which stands for "Coupled Model Intercomparison Project") involves many different institutions and labs running a standardized series of virtual experiments with a uniform set of initial and boundary conditions. Every few years, scientists will get together and hammer out a set of questions to answer, turn those into model experiments, and then go home and run the same simulations on each of their own home institution's in-house model. As part of that, those multi-model ensembles will incorporate what are called "perturbed physics ensembles," which involve holding initial conditions constant and exploring how systematically varying the values of different parameters changes the final output. This helps us explore the "value space" for these highly tunable parameters and see which things look to be sensitively dependent on which other things. The final consensus predictions (that you see, for instance, in the IPCC reports) are the result of integrating the results from all of these different ensemble runs that varied the underlying model physics (multi-model ensembles), initial conditions (initial condition ensembles), and parameter values (perturbed physics ensembles). That's why the official numbers tend to be in the moderate range: the ensemble approach "smooths out" the more pessimistic and optimistic predictions.
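Here's a perturbed-physics ensemble in miniature: hold everything else fixed, sweep a tunable feedback parameter across its assumed plausible range, run the same toy energy-balance model for each value, and look at the spread. Everything here (the one-box model, the parameter range, the constants) is an illustrative stand-in, not CMIP machinery:

```python
import numpy as np

def toy_climate_model(feedback, forcing=3.7, planck=3.2, years=200):
    """One-box energy-balance 'model': warm until the net response balances the forcing.

    Illustrative constants only; heat_capacity sets the adjustment timescale.
    """
    heat_capacity = 8.0
    T = 0.0
    for _ in range(years):
        T += (forcing - (planck - feedback) * T) / heat_capacity
    return T

# Perturbed-physics ensemble: identical initial conditions and forcing,
# different values of the tunable feedback parameter.
feedback_values = np.linspace(1.4, 2.6, 25)
ensemble = [toy_climate_model(f) for f in feedback_values]
print(f"ensemble range: {min(ensemble):.1f} K to {max(ensemble):.1f} K")
print(f"ensemble mean:  {np.mean(ensemble):.1f} K")
```

The reported "consensus" numbers behave like that ensemble mean: they sit in the middle of the swept parameter range, which is exactly the smoothing effect described above.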

Is that a guarantee that the consensus result is more accurate? No, not really, but it's hard to see how we could do it any better. In particular, if there are systematic biases that infect all the major models (because, for instance, they're all descended from a small number of early ancestor models that made some bad assumptions), ensemble modeling won't fix that. Some models will also incorporate processes or parameters that others just ignore. If those models are "more right," then their predictions are probably closer to reality. That's very hard to see in advance, though. Hansen's predictions are more pessimistic than many others' partially because he leans toward parameter values that ascribe a stronger (and less self-limiting) role to positive feedbacks; it's looking increasingly like reality is bearing that out. There have also been some big surprises recently that almost nobody saw coming, like the collapse of land-based carbon sinks starting in 2023. Those sorts of long-tail processes are very, very hard to incorporate into models until after the fact because they represent "unknown unknowns," but the general trend has been toward these "surprises" being pretty much uniformly bad; very rarely does something happen that makes warming run more slowly than the models suggested. Some people are trying to incorporate that into the models by over-sampling the more pessimistic end of parameter values when doing ensemble modeling. That's controversial.

[–] LeeeroooyJeeenkiiins@hexbear.net 5 points 2 weeks ago (1 children)

how did it ace the season did the speak n spell get the names right

[–] reddit@hexbear.net 4 points 2 weeks ago

Google DeepMind is not an LLM; it's one of the few genuinely impressive technologies to come out of this boom (and, technically, of the ML/Big Data boom before it)