I have to build a database migration that generates user handles. The user handles are unique within an organization. The user can change them. The auto generated handles are either the first name + last name, or the business name depending on which user type it is. Unless it would be a duplicate. Duplicates auto increment if the handle is taken. The character limit for a user handle is the same length as first name plus last name so I have to check for possible overflow if I add digits. I also have to see if the generated name is in the DB already because a user could have custom entered the result of the auto generation.

This has to be programmed async. The DB driver is using a transaction but multiple calls have to be made to check if the generated handle exists for that organization. Also I have to check the migration script itself for possible duplicates. 3/4 of the users have a handle and with the scale there will definitely be duplicate names.

My idea is if there is a collision, use a UUID and let the users pick something nicer next time they log in. Business says “Reeeeeee!!!! The users shouldn’t see a UUID!!! You can do this!!!” Absurd uniqueness requirements. Absurd backfill procedure. Absurd business rules.

  • 2
    i would implemtent it this way:
    if the user logs in first i would create the handle with the standard schema.
    if an duplicate error is raised i would add an uuid/random number and an truncation to the maximum length. i assume there are no duplicates with the longest names
    If an user wants to change his/her handle if an requirement fails i would say this in an error message.

    If they really want numbered handles create a new table that has the unnumbered handles with and counter and where the application holds a row lock.
    if the amount of space for userhandles is really thight you can try to transform the number to base 36(0-9a-z)
  • 1
    @stop What you are suggesting is very close to the things I have been building. Also kudos on a well constructed comment. 😀

    The main issue is that it can’t be run by the user. Instead it has to be run in the devops pipeline. For long names we do have people in non English countries that are filling the name length. Then those same users can also be represented as an agent organization where the organization name is identical to the person’s name.

    Every uniqueness check to prod is slow and requires async coordination and handling.

    When I looked at prod this will be back filled for almost 70k users using 7 languages. So plenty of room for super weird data issues.
Add Comment