That’s interesting, but I’m suspicious of it for some reason.

https://www.annualreviews.org/doi/full/10.1146/annurev-statistics-060116-054026

esp. Fig. 1.

Yeah. Socrates once complained that the problem with books is that you can’t ask a book questions. But on a blog you *can* ask questions — and other people can read the answers! This is what I like about blogs.

Yes, but they do this when they *start* with the coordinates. Here the coordinates are derived at the end of a series of steps; I think that's adding to the confusion. I see why you did it this way, but it's an unusual order in which to do things.

Anyway, I'm really hoping that people who see these comments might think “Ah, that's why I was confused, but now I understand,” rather than hoping that you'll change the notation, especially since I don’t have a natural alternative to suggest.

Everyone in differential geometry uses x^{i} as coordinates of a point called x, everyone in physics uses q_{i} as coordinates of a position called q, etc. This is confusing when you first see it, and you might feel tempted to write x^{i}(x) for the coordinate function x^{i} evaluated at the point x. But it’s good to get used to.

I’m sorry my exposition seemed convoluted. Maybe it’s because I was trying to explain two subjects in one blog post: statistical manifolds and thermodynamics. They’re closely related. In a statistical manifold each point labels a probability distribution. In thermodynamics each point of a manifold labels a probability distribution that maximizes entropy subject to constraints on some observables, and the expected values of these observables serve as coordinates.
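To make the statistical-manifold picture concrete, here is a minimal numerical sketch. The Gaussian family is my own illustrative choice, not an example taken from the post: each point (mu, sigma) of the upper half-plane labels a probability distribution on the real line.

```python
import numpy as np

# A toy statistical manifold (illustrative choice, not from the post):
# each point (mu, sigma) of the upper half-plane labels a Gaussian.
def distribution(mu, sigma):
    """Return the probability density labeled by the point (mu, sigma)."""
    return lambda x: (
        np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))
    )

# Two different points of the manifold label two different distributions.
p1 = distribution(0.0, 1.0)
p2 = distribution(1.0, 2.0)

# Sanity check: each density integrates to 1 over a wide grid.
x = np.linspace(-30.0, 30.0, 200001)
dx = x[1] - x[0]
print((p1(x) * dx).sum())  # close to 1
print((p2(x) * dx).sum())  # close to 1
```

The manifold structure enters when you ask how the distribution changes as you move the point (mu, sigma), which is where the Fisher metric of information geometry comes from.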

I didn’t have a whole lot I wanted to say about statistical manifolds this time, except as a lead-in to thermodynamics. So I just laid out the basic formalism and then took a hard right turn into thermodynamics. If I ever turn this stuff into a book I’ll try to give the readers a more gentle ride.

I think it would help to use different symbols for q (a point of the manifold Q) and q_{i} (a coordinate function on Q), because even though q can be written as a tuple using the coordinates q_{i}, q still comes well before the q_{i} in the development.

Francis wrote:

I guess I’m confused about what the q_{i} are supposed to be: you defined them starting from A_{i} and the distribution, but now you use them to define another distribution.

No, there’s only one distribution in this story: more precisely, one map sending each point q of the manifold Q to a probability distribution on \Omega.

Starting from this and some observables A_{i} on \Omega, I defined the function q_{i} on Q to be the expected value of A_{i} in the distribution assigned to the point q.

Then I assumed that the functions q_{i} are a coordinate system on Q (and I explained why this is easy to achieve).

Then I made a big extra assumption: the distribution assigned to a point is the probability distribution with the *largest possible entropy* such that the expected value of A_{i} is q_{i} for all *i*. In other words: if you tell me the values of the coordinates q_{i}, then the distribution is not just any old probability distribution having these values as the expected values of the observables A_{i}. It’s the probability distribution with *the largest possible entropy* having these values as the expected values of the observables A_{i}.
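Here is a minimal numerical sketch of that maximum-entropy assumption; the four-point sample space, the single observable A, and the bisection solver are all invented for illustration. Among distributions with a given expected value of A, the entropy maximizer has a Gibbs-type exponential form, and any other distribution meeting the same constraint has strictly smaller entropy:

```python
import math

# Illustrative finite sample space with one observable A
# (made up for this sketch; not the setup from the post).
A = [0.0, 1.0, 2.0, 3.0]

def gibbs(lam):
    """Maximum-entropy distribution with <A> fixed: p_j ~ exp(-lam * A_j)."""
    w = [math.exp(-lam * a) for a in A]
    Z = sum(w)
    return [wi / Z for wi in w]

def mean_A(p):
    return sum(pi * ai for pi, ai in zip(p, A))

def entropy(p):
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

# Target coordinate value q = <A>.  Solve mean_A(gibbs(lam)) = q for lam
# by bisection: the mean is strictly decreasing in lam.
q = 1.0
lo, hi = -50.0, 50.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if mean_A(gibbs(mid)) > q:
        lo = mid
    else:
        hi = mid
p_max = gibbs(0.5 * (lo + hi))

# Another distribution with the same expected value has smaller entropy.
p_other = [0.5, 0.0, 0.5, 0.0]          # also has <A> = 1
assert abs(mean_A(p_max) - q) < 1e-9
assert entropy(p_max) > entropy(p_other)
```

The assertion at the end is the point: fixing the coordinate value q does not pin down a distribution by itself, but adding the maximum-entropy requirement does.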

I’m not changing the distribution here; I’m just making an extra assumption about it. This is a common assumption in thermodynamics.

Without this extra assumption there’s nothing very exciting we can do until someone tells us a formula for the distribution. But *with* this extra assumption I can show that the distribution must obey the equation I wrote down. (I defined the quantities appearing in it in the post.)

I will actually prove this equation next time.

I hope the logic is clearer now. If there’s something puzzling still, just ask.

I guess I’m confused about what the q_{i} are supposed to be: you defined them starting from A_{i} and the distribution, but now you use them to define another distribution. So do we assume that we have coordinates q_{i} on Q and observables A_{i} on \Omega already, and then consider probability distributions for which the expectation of A_{i} is q_{i}?

Hi! That’s a good question. I will answer it someday, as I keep discussing the analogy between classical mechanics and information geometry. There are actually a few different possibilities, depending on what we take as the analogue of time.
