rickross asked

[meta] Specifics of the SQLTeam data merge

We're finally about to merge the data from SQLTeam into Ask.SSC. There has been some good discussion of this before, but now we're getting down to specifics. We could use your help to make sure the bases are well covered.

First, there is the issue of different tags being used to represent basically the same thing. We have uploaded CSV files of all the tags from both SQLTeam and SSC so nobody needs to endure the aggravation of cutting and pasting pages of web data into a spreadsheet. What we'd like (and we're not exactly sure of the optimal way to do it) is to have you guys decide which tags from SQLTeam you would like remapped to existing tags in SSC. We really just need a file of tag pairs showing which target tag to use for each original tag that you want mapped. If there is no mapping for a given tag, then we'll just preserve the original tag.

Second, and significantly more complex, is the question of correctly combining users who have accounts in both communities. Here are some statistics:

1. There are 2664 users in SSC and 1515 in SQLTeam.
2. There are 119 users with the exact same username.
3. 115 users in both sites have the same email.
4. Of these users, 57 also have the same username.
5. There are 29 users that share some authentication key.
6. Of these, 21 have the same username on both sides.
7. 23 users share both email and authentication key.
8. 19 share it all: auth key, email and username.

So, you can see that the overlap is, at best, unclear and doesn't cover most users. We have a strategy in mind to produce a reasonable combination of the user datasets, and we think we have a good idea about how to let users combine accounts in a "self-service" mode for those who end up with multiple accounts. Hernani will discuss the merge strategy more below.

Finally, there may be other important issues about merging that we have not considered, so let's get them all on the table in hopes of getting to the most useful and successful outcome. Thanks!
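As a rough cross-check of the statistics above, inclusion-exclusion gives the number of accounts that match on at least one signal. This is only a sketch: it assumes each figure counts distinct one-to-one account pairs between the two sites, which the numbers themselves don't guarantee, and the variable names are made up for illustration.

```python
# Counts quoted in the question (accounts matched across the two sites).
same_username = 119
same_email = 115
same_auth_key = 29
email_and_username = 57
auth_and_username = 21
email_and_auth = 23
all_three = 19

# Accounts matched by email or auth key (the stronger signals).
email_or_auth = same_email + same_auth_key - email_and_auth

# Accounts matched by at least one of the three signals (inclusion-exclusion).
any_signal = (
    same_username + same_email + same_auth_key
    - email_and_username - auth_and_username - email_and_auth
    + all_three
)

sqlteam_users = 1515
print(email_or_auth)                          # 121
print(any_signal)                             # 181
print(f"{any_signal / sqlteam_users:.1%}")    # share of SQLTeam accounts
```

Under that assumption, 121 accounts match on email or auth key, and 181 (about 12% of the SQLTeam accounts) match on anything at all, username included.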
meta-asksscmergesqlteam

Kev Riley ♦♦ commented
If everyone's OK with it, I'm happy to go through the tags and email back to Rick, rather than lots of people giving lots of different views...?
3 Likes
Grant Fritchey ♦♦ commented
Go for it. If we don't like it, we'll scream really loudly. You'll hear it.
3 Likes
Matt Whitfield ♦♦ commented
@rickross - Well, I'm not sure which group I fall into, but I know that my accounts were the same. Now that ask.sqlteam is on OSQA, though, I can't log in to check. I'm pretty sure I had to change my OpenID in order to be able to use Ask.SSC after the move, so do your matching stats account for that?
1 Like
rickross ♦♦ commented
@Matt, the various OpenID providers have differing interpretations of what hashkeys will work when hostnames and URLs change. Google is the most widely used and also the most strict, whereas MyOpenID is more lenient. Since the hostname has not actually changed, it shouldn't have been a problem to continue to log in on the new OSQA-powered version with the same OpenID that was used before. You personally have the same auth credential on both sites, so we do not anticipate a problem. In general, if there is not an exact match on the auth credentials, there is no match at all. These hashkeys for OpenID are very long and complex. There's no way for us to guess that two different hashkeys actually belong to the same user. PS - The only change in your OpenID from the original SE data dump was a switch from *http:* to *https:* (which seems more appropriate for a secure login credential, anyway).
1 Like
hernani commented
@Matt, I believe you fall in #8.
hernani answered
I would like to give my suggestions regarding tags. I believe that the ones to be preserved are the ones from ask.sqlservercentral. So besides sql200X to sql-server-200X, the most obvious ones are stored-procedure to stored-procedures and fulltextsearch to full-text.
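For anyone preparing the mapping file, here is a minimal sketch of how these pairs could be applied, including the "no mapping means keep the original tag" rule from the question. The function names and the regex for the `sql200X` pattern are my own assumptions, not the actual import code.

```python
import re

# Explicit pairs named in this answer: SQLTeam tag -> Ask.SSC tag.
TAG_MAP = {
    "stored-procedure": "stored-procedures",
    "fulltextsearch": "full-text",
}

# "sql200X" -> "sql-server-200X", e.g. sql2005 -> sql-server-2005.
SQL_VERSION = re.compile(r"sql(200\d)")


def remap_tag(tag: str) -> str:
    """Return the target tag, or the original tag when no mapping exists."""
    if tag in TAG_MAP:
        return TAG_MAP[tag]
    match = SQL_VERSION.fullmatch(tag)
    if match:
        return f"sql-server-{match.group(1)}"
    return tag


def remap_tags(tags):
    """Remap all tags on one question, dropping duplicates the remap creates."""
    result = []
    for tag in map(remap_tag, tags):
        if tag not in result:
            result.append(tag)
    return result


print(remap_tags(["stored-procedure", "stored-procedures", "sql2008", "tsql"]))
```

The duplicate check matters because a question already tagged with both the old and the new spelling would otherwise end up with the same tag twice.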

Matt Whitfield ♦♦ commented
@hernani - I think Kev Riley is coming up with a full mapping for you...
hernani commented
Ok, thanks.
hernani answered
We've put up a test site with a full merge and very recent data here: Please give your feedback and let us know if there is anything that should be fixed prior to the definitive merge. EDIT: Please upvote or downvote this answer depending on whether you agree or disagree that the merge can proceed.

Mark commented
Hmm... snappy. I like it.
ThomasRushton ♦♦ commented
Login issues. Tried giving it my Google login ID, and it asked me to provide a screen name. Gave it my usual screen name, and it's (surprise) already in use... but it won't let me in with that name...
hernani commented
@Thomas, forgot to mention that Google and Yahoo logins won't work, because it's running on a different domain.
Matt Whitfield ♦♦ commented
@hernani I have a problem...
Matt Whitfield ♦♦ commented
Check out and then
Matt Whitfield ♦♦ commented
It seems a blockquote has been added in between every paragraph. Also, for some reason there is a 'Grant Fritchey 1' user instead of Grant's rep being added to his. My rep hasn't been added to mine (although, for example, Peso's rep has been added together).
Matt Whitfield ♦♦ commented
Answers / comments seem intact, however...
hernani commented
Nice catch @Matt, looks like some markdown went wild.
ThomasRushton ♦♦ commented
@hernani - thanks! Will try again, when I can remember my ordinary SSC login details.
hernani commented
Well, user merging is based on either of:
1. the same email
2. some authentication method being shared

A machine is doing the majority of the job, so we can't expect it to be perfect. However, there is user-merging functionality on the way.
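A minimal sketch of the matching rule described in the comment above (same email, or at least one shared authentication key). The `User` class, its fields, the sample accounts and the case-insensitive email comparison are all assumptions for illustration, not the actual OSQA merge code. Auth keys are compared exactly, in line with rickross's point that OpenID hashkeys either match exactly or not at all.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    site: str
    username: str
    email: str
    auth_keys: set = field(default_factory=set)


def is_same_person(a: User, b: User) -> bool:
    """Match on same email (case-insensitive, an assumption) or a shared auth key."""
    same_email = bool(a.email) and a.email.lower() == b.email.lower()
    shared_auth = bool(a.auth_keys & b.auth_keys)
    return same_email or shared_auth


def match_users(ssc_users, sqlteam_users):
    """Pair each SQLTeam user with the first matching SSC user, if any."""
    matches, unmatched = [], []
    for incoming in sqlteam_users:
        target = next((u for u in ssc_users if is_same_person(u, incoming)), None)
        if target is None:
            unmatched.append(incoming)
        else:
            matches.append((target, incoming))
    return matches, unmatched


# Hypothetical accounts: the second SQLTeam user differs only by http vs https
# in the auth key, so an exact comparison does not match it.
ssc = [User("ssc", "alice", "alice@example.com", {"https://openid.example/alice"})]
sqlteam = [
    User("sqlteam", "alice_sqlteam", "Alice@Example.com"),
    User("sqlteam", "alice2", "other@example.com", {"http://openid.example/alice"}),
]
matches, unmatched = match_users(ssc, sqlteam)
print([(a.username, b.username) for a, b in matches])
print([u.username for u in unmatched])
```

Accounts left in `unmatched` would come across as separate users, which is where a self-service merge would pick up.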
ThomasRushton ♦♦ commented
The login using my SSC ID was a bit flaky (but that might have been me...), and it's not picking up the fact that I already have an account, so I have created a new one: "thomasrushton" vs "ThomasRushton". This could get confusing!
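One way to spot this kind of near-duplicate before the final merge would be to group screen names case-insensitively. A small sketch; the function name is hypothetical, and it says nothing about how OSQA itself compares names.

```python
def find_name_collisions(usernames):
    """Group screen names that differ only by letter case."""
    groups = {}
    for name in usernames:
        groups.setdefault(name.casefold(), []).append(name)
    return [names for names in groups.values() if len(names) > 1]


print(find_name_collisions(["ThomasRushton", "thomasrushton", "hernani"]))
```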
Matt Whitfield ♦♦ commented
@hernani - it's weird, because I can't find any reference to my SQLTeam.com user - I would have thought it would be 'Matt Whitfield 1' or something... Also, I looked for a question I had answered on Ask.SQLTeam - and found this one - http://ask.sqlteam.com/questions/1460/smo-systemsecuritysecurityexception. I can't find that question on the merge site at all?
Matt Whitfield ♦♦ commented
@hernani - http://ask.sqlteam.com/questions/1514/add-more-years-to-mysql-database - can't find the equivalent of that one either? Is the search database up to date on the merge?
hernani commented
@Matt, it was merged with yours (same email or auth setting). The question you're looking for is here:
hernani commented
The final merge will have functionality to redirect old SQLTeam URLs to the new ones.
Matt Whitfield ♦♦ commented
@hernani - OK, so the questions I have are: How come my rep isn't added together? Does that affect any other users? I'm not too bothered about the 71 rep; I'm bothered about some non-understood side effect. Also, how come I couldn't find that question? For example, for the second one I searched for 'mysql' - it really looks like the search DBs aren't merged, because I got the questions from Ask.SSC returned, but not the answers from Ask.SQLTeam...
hernani commented
@Matt, yes, the FTS index is not updated with the SQLTeam content on this test site. It's a bit of a long process, but it will be working in the final merge; I've tested it already. And yes, your user's rep really doesn't seem to have been updated. Most of them seem to have been, though. I'll have to double check.
Matt Whitfield ♦♦ commented
@hernani - Ok, well, when that's resolved, and you're happy with the markdown, then I think we're as good as we will be. I would +1 you, but I'm out of votes. I'm not sure if that's accurate, or I'm just very generous - but that's not an issue for right now...
hernani commented
@Matt, not happy with the markdown :) Trying to sort it out, and also the reputation, don't worry.
Matt Whitfield ♦♦ commented
@hernani - I'll leave it in your capable hands sir. It's pretty much midnight over here so I'm going for the cider + chocolate + telly combo...
hernani commented
Thanks @Matt, just a quick update. The reputation problem is sorted out: I forgot to restart the cache server, so some users still had the old values.
Matt Whitfield ♦♦ commented
@hernani - Hmmm... not sure. It now says my rep is 20,647 on merge, and it should be something like 19,665 (I think)... any ideas there?
hernani commented
@Matt, yes, in the process I also took the opportunity to fix some denormalized data caused by some issues here and there.
