I don't remember such details, but as you suggest, it is a healthy kind of compression.
I address it through merging the lower-level memories into more abstracted ones through a temporal hierarchical filesystem. So, days -> months -> quarters -> years. Each time scale focuses on a more "useful" context since uncertain/contradictory information does not survive as it goes up in abstraction.
For example, A day-level memory might be: "The user learned how to divide 314 by 5 with long division on Jan 3rd 2017."
A year-level memory might be: "The user progressed significantly in mathematics during elementary school."
From the perspective of the LLM, it is easier to access the year-level memories because it requires fewer "cd" commands, and it only dives down into lower levels when necessary.