The progress on the restoration was somewhat upended by the currently ongoing issues with the server and the forum, where we're still trying to figure out what's going on.
I've still put some more time into this today, to see if I can get a check based on quote chains implemented, and it's getting there.
View attachment 733346
Here is an example. The dark blue posts are in the Dawn of Civilization thread, the red ones are in the Enhanced User Interface (EUI) thread, and the light blue ones are currently not assigned. All connected posts are quoting each other in some way. You can see that e.g. in the case on the right, at the bottom, there is post 1616009, which is currently not assigned to a thread, but it quotes (or is quoted by, the direction does not matter) a post in the EUI thread, which in turn is quoted by another post in the EUI thread, etc. So... good chance that post 1616009 belongs in the EUI thread.
In some cases, like on the left, there's nothing to be gained, all quoted posts are already in the right thread.
In the case 2nd from the left, none of the posts is assigned, but for longer chains I should check if I can manually tell which thread they belong to, assign one post, and then get the rest assigned automatically.
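To make the chain idea concrete, here is a rough sketch of how such a check could look. This is not the actual script, just an illustration under assumptions: the function name, the input shapes (a list of quote pairs and a dict of already known assignments) and the post IDs are all made up. It treats quotes as undirected links, collects each connected chain, and only fills in unassigned posts when the whole chain points to exactly one thread.

```python
from collections import defaultdict


def propagate_threads(quote_pairs, assigned):
    """Fill in threads for unassigned posts via quote chains.

    quote_pairs: iterable of (post_id, post_id), one pair per quote link (hypothetical shape).
    assigned:    dict post_id -> thread name for posts that are already assigned.
    """
    # Build an undirected graph, because the quote direction does not matter.
    neighbours = defaultdict(set)
    for a, b in quote_pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)

    result = dict(assigned)
    seen = set()
    for start in list(neighbours):
        if start in seen:
            continue
        # Collect one connected component, i.e. one quote chain.
        component, stack = [], [start]
        seen.add(start)
        while stack:
            post = stack.pop()
            component.append(post)
            for nxt in neighbours[post]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        # Only propagate if the chain touches exactly one known thread.
        threads = {assigned[p] for p in component if p in assigned}
        if len(threads) == 1:
            thread = next(iter(threads))
            for p in component:
                result.setdefault(p, thread)
    return result


# Made-up IDs: chain 1-2-3 touches EUI, chain 10-11 has no assigned post.
example = propagate_threads(
    [(1, 2), (2, 3), (10, 11)],
    {2: "EUI", 3: "EUI"},
)
```

In this sketch post 1 ends up in EUI, while 10 and 11 stay unassigned until one of them gets assigned by hand. Chains that touch both threads are left alone on purpose, since those probably need a manual look.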
Now I need to iteratively go through the data, expand the chains, then check again to collect usernames per thread, see if a username can be clearly assigned to a thread, see if this adds a new chain, expand the chain, repeat.
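The loop described here could be sketched roughly like this. Again just an illustration with assumed names and data shapes (quote pairs, a post-to-username dict, a dict of known assignments), not the real code. It alternates between expanding along quote links and assigning posts by username when that user only shows up in a single thread, and repeats until nothing changes anymore. As a simplification, a post linked to both threads simply takes whichever neighbour is seen first.

```python
from collections import defaultdict


def assign_iteratively(quote_pairs, authors, assigned):
    """Repeat chain expansion and username matching until nothing changes.

    quote_pairs: list of (post_id, post_id) quote links (hypothetical shape).
    authors:     dict post_id -> username.
    assigned:    dict post_id -> thread name for already assigned posts.
    """
    result = dict(assigned)
    changed = True
    while changed:
        changed = False

        # Step 1: expand chains. An unassigned post takes the thread of an
        # assigned post it is linked to (direction does not matter).
        for a, b in quote_pairs:
            for src, dst in ((a, b), (b, a)):
                if src in result and dst not in result:
                    result[dst] = result[src]
                    changed = True

        # Step 2: collect the threads each username has posted in so far.
        threads_by_user = defaultdict(set)
        for post, thread in result.items():
            user = authors.get(post)
            if user is not None:
                threads_by_user[user].add(thread)

        # Step 3: a user seen in exactly one thread is "clearly assigned",
        # so their remaining posts go to that thread.
        for post, user in authors.items():
            user_threads = threads_by_user.get(user, set())
            if post not in result and len(user_threads) == 1:
                result[post] = next(iter(user_threads))
                changed = True
    return result
```

Every pass through the loop either assigns at least one more post or stops, so it always terminates. The username step is the risky part: a user who posted in both threads never gets matched, which is the safe behaviour, but a user who happens to be known in only one thread so far could still be misassigned.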
This will only work for posts up to and including 2024, because after that we have one of the big Civ7 threads, which is not assigned and which attracted users from everywhere. For 2024 and before we only have the 2 threads, Dawn of Civ and EUI, so that is easy. Well, "easy".
In total, 2778 posts still need to be assigned somehow (or they might stay in our recycle bin).
1004 of those are from up to and including 2024, of which approx. 400 can already be assigned based on username. I hope to maybe get to 700, which will leave us with a leftover of roughly 2000 posts.
Then... no clue
