Ah, this modifies what I just said.
Perhaps there is a software solution for doing it in batches, but yes, I see how the confirmation system needs to be worked out first.
The main issue is that there is no way to notify users that their accounts exist here and that they can create a password, because we (of course) don't have access to email addresses. I am thinking that any user accounts created before the import will easily get their posts, and for people who notice afterwards we'll have to assign posts manually on request.
How about a welcome-to-the-forum thread where people can introduce themselves? It would also make it easy for us to know who is registered at which of the available forums.
There is currently a global announcement across the forum where people are saying hi (http://fextralife.co...bioware-forums/) - I can repurpose it for introductions as well so it's visible on all categories
What is the overall strategy for account/user name mapping? As I see it, there are several cases to consider:
1) Non-BSN pre-existing fextra user name that collides with BSN user name
2) BSN user name that is not yet registered in fextra
3) BSN user name that is dormant and will never be registered at fextra (some of these are quite important, like Bioware employees that used to post good stuff)
4) BSN user that is registered at fextra, but their fextra name is not an exact string match with original BSN user name (they might have come to the two sites independently at different points in time, but are the same person).
5) BSN and fextra names are perfect match and are the same person (the happy case)
For the sake of legacy/archival content, I don't think the full profile details of a user are that important. Minimally, we just need to know that post A and reply B were written by two different people, and that when reply C quotes reply B, we know who is being quoted. All posts by A should also be browsable (currently this can be done from the Profile page). All the rest isn't essential. I wouldn't even expect the number of views or likes to be preserved.
It's also important for the search-by-name function to work across legacy content. Use case: I remember that Patrick Weekes posted something interesting in 2014, but that's all I remember about it. I should be able to search successfully given just that information.
Indeed, this is one of the challenges of importing!
1) We will find these programmatically on import, then match them and disambiguate with a symbol. Merge if it's the same person (e.g. me).
2) I'm looking at possibly creating a "ghost account" that isn't really an account but holds the posts of, say, Person A on Bioware. If this person later decides to sign up, they register normally, then message me to confirm their identity, and I assign them all of Person A's posts.
3) Possibly it will be a simple username display that is not actually an account.
4) We will find these programmatically and match them (for example, people with spaces in their names have probably become user_name)
5) Easy peasy
Indeed as you say the naming is important for searches, so I'll try and find a good way to make these "ghost" users display something useful.
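To make the matching idea concrete, here is a rough sketch in Python of how the name lists could be sorted into buckets on import. It is only an illustration, not the actual import script: the function names, the normalization rule (dropping spaces, underscores, hyphens and dots, ignoring case) and the "exact" / "fuzzy" / "ghost" labels are all assumptions. Note that an exact string match cannot tell case 1 (a collision with a different person) from case 5 (the same person), so both "exact" and "fuzzy" hits would still need confirmation before merging.

```python
import re


def normalize(name):
    """Lowercase a name and drop spaces, underscores, hyphens and dots."""
    return re.sub(r"[\s_\-.]+", "", name).lower()


def map_bsn_users(bsn_names, fextra_names):
    """Classify each BSN name against the existing fextra names.

    Returns a dict of bsn_name -> (label, fextra_name_or_None), where label is:
      'exact' - identical string exists on fextra (same person OR a collision)
      'fuzzy' - exactly one fextra name matches after normalization
      'ghost' - no usable match; posts would be held under a ghost user
    """
    exact = set(fextra_names)
    by_norm = {}
    for fextra_name in fextra_names:
        by_norm.setdefault(normalize(fextra_name), []).append(fextra_name)

    result = {}
    for name in bsn_names:
        if name in exact:
            result[name] = ("exact", name)
            continue
        candidates = by_norm.get(normalize(name), [])
        if len(candidates) == 1:
            result[name] = ("fuzzy", candidates[0])
        else:
            # Zero or several candidates: too ambiguous to match automatically.
            result[name] = ("ghost", None)
    return result


if __name__ == "__main__":
    # Made-up names, purely for illustration.
    mapping = map_bsn_users(
        ["Patrick Weekes", "Fex", "Old Dev"],
        ["patrick_weekes", "Fex"],
    )
    for bsn_name, (label, match) in mapping.items():
        print(bsn_name, label, match)
```

Keeping the original BSN name as the dictionary key means the ghost users can still display the legacy name, which is what search-by-name would need.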
Separate question: will you automate BBCode conversion during the scrape, or do that as a post-process step?
Fortunately, they are mostly minor edits. Inline images, spoilers, quotes, font sizes and emoticons were the biggest syntax issues that I saw.
Details here:
http://fextralife.co...opy-paste-test/
Yes, I plan on converting everything to BBCode before the import. I will probably use something like html>bbcode, which means not everything will display well, but it should save a lot of work.
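For anyone curious what such a conversion pass might look like, below is a minimal sketch using only Python's standard `html.parser` module. It is not the converter that will actually be used; the tag mapping is an assumption about which HTML tags show up in the scraped posts, and anything not in the mapping (spoilers, font sizes, emoticons) is simply stripped while its text is kept, which is exactly the "not everything will display well" trade-off.

```python
from html.parser import HTMLParser

# Assumed mapping from HTML tags to BBCode tags; the real scraped markup may differ.
SIMPLE_TAGS = {
    "b": "b",
    "strong": "b",
    "i": "i",
    "em": "i",
    "u": "u",
    "blockquote": "quote",
}


class HtmlToBBCode(HTMLParser):
    """Collects BBCode fragments while the parser walks the HTML."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in SIMPLE_TAGS:
            self.out.append(f"[{SIMPLE_TAGS[tag]}]")
        elif tag == "img" and attrs.get("src"):
            self.out.append(f"[img]{attrs['src']}[/img]")
        elif tag == "br":
            self.out.append("\n")
        # Any other tag is dropped; its text content is still kept.

    def handle_endtag(self, tag):
        if tag in SIMPLE_TAGS:
            self.out.append(f"[/{SIMPLE_TAGS[tag]}]")

    def handle_data(self, data):
        self.out.append(data)


def html_to_bbcode(html):
    """Convert an HTML fragment to BBCode, dropping unsupported tags."""
    parser = HtmlToBBCode()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    print(html_to_bbcode('<b>Hi</b> <span>there</span><br><img src="a.png">'))
```

Running this as a separate step after the scrape (rather than during it) would let the raw HTML be kept around, so the conversion can be rerun if the mapping needs to change.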
Progress Update:
The Mass Effect forum is enormous. My computer nearly exploded. But it's hanging in there!