
Schema migration in Quino 1.13

Published by marco in Programming

Quino is a metadata framework for .NET. It provides a means of defining an application-domain model in the form of metadata objects. Quino also provides many components and support libraries that work with that metadata to automate many services and functions. A few examples are an ORM, schema migration, automatically generated user interfaces and reporting tools.

The schema-migration tool

The component we’re going to discuss is the automated schema-migration for databases. A question that recently came up with a customer was: what do all of the options mean in the console-based schema migrator?

Here’s the menu you’ll see in the console migrator:

Advanced Options
(1) Show migration plan
(2) Show significant mappings
(3) Show significant mappings with unique ids
(4) Show all mappings
(5) Show all mappings with unique ids

Main Options
(R) Refresh status
(M) Migrate database
(C) Cancel

The brief summary is:

  • The only action that actually makes changes is (M)
  • Option (1) is the only advanced option you will likely ever use; use it to show the changes that were detected

The other advanced options are more for debugging the migration recommendation if something looks wrong. In order to understand what that means, we need to know what the migrator actually does.

 Schema migration overview

  1. Provide the application model as input
  2. Import a model from the database as input
  3. Generate a mapping between the two models
  4. Create a migration plan to update the database to reflect the application model
  5. Generate a list of commands that can be applied to the database to enact the plan
  6. Execute the commands against the database

The initial database-import and final command-generation parts of migration are very database-specific. The determination of differences is also partially database-specific (e.g. some databases do not allow certain features so there is no point in detecting a difference that cannot ever be repaired). The rest of the migration logic is database-independent.

Gathering data for migration

The migrator works with two models: the target model and the source model.

  • The target model is provided as part of the application and is usually loaded from a core assembly.
  • The source model is imported from the database schema by the “import handler”.

Given these two models, the “mapping builder” creates a mapping. In the current implementation of Quino, there is no support for allowing the user to adjust mapping before a migration plan is built from it. However, it would be possible to allow the user to verify and possibly adjust the mapping. Experience has shown that this is not necessary. Anytime we thought we needed to adjust the mapping, the problem was instead that the target model had been configured incorrectly. That is, each time we had an unexpected mapping, it led us directly to a misconfiguration in the model.

The options to show mappings are used to debug exactly such situations. Before we talk about mapping, though, we should talk about what we mean by “unique ids”. Every schema-relevant bit of metadata in a Quino model is associated with a unique id, in the form of a Guid and called a “MetaId” in Quino.

Importing a model from a database

What happens when the import handler generates a model?

The importer runs in two phases:

  1. Extract the “raw model” from the database schema
  2. Enhance the “raw model” with data pulled from the application-specific Quino metadata table in the same database

A Quino application named “demo” will have the following schema:

  • All modeled tables are named “demo__*”
  • The metadata table is named “demometadata__elementdescription”

The migrator reads the following information into a “raw model”:

  • Tables => MetaClasses
  • Fields/Columns => MetaProperties
  • Indexes => MetaIndexes
  • Foreign Keys => MetaPaths

If there is no further information in the database, then the mapper will have to use the raw model only. If, however, the database was created or is being maintained by Quino, then there is additional information stored in the metadata table mentioned above. The importer enhances the raw model with this information in order to improve mapping and difference-recognition. The metadata table contains all of the Quino modeling information that is not reflected in a standard database schema (e.g. the aforementioned MetaId).

The data available in this table is currently:

  • SchemaIdentifier: the identifier used in the raw model/database schema
  • Identifier: the actual identifier of the metadata element that corresponds to the element identified by the SchemaIdentifier
  • MetaId: the unique id for the metadata element
  • ObjectType: the type of metadata (one of: class, property, index, path, model)
  • ParentMetaId: the unique id of the metadata element that is the logical parent of this one; only allowed to be empty for elements with ObjectType equal to “model”
  • Data: Custom data associated with the element, as key/value pairs
  • DataVersion: Identifies the format type of the “Data” element (1.0.0.0 corresponds to CSV)
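
For illustration, a row of this table could be represented with a simple C# type like the following (the shape is inferred from the columns listed above; this is not the actual Quino type):

```csharp
using System;

// Hypothetical shape of a row in the "demometadata__elementdescription" table,
// based on the columns described above; not the actual Quino type.
public sealed class ElementDescriptionRow
{
  // Identifier used in the raw model/database schema (e.g. "punchclock__person")
  public string SchemaIdentifier { get; set; }

  // Identifier of the corresponding metadata element
  public string Identifier { get; set; }

  // Unique id of the metadata element
  public Guid MetaId { get; set; }

  // One of: class, property, index, path, model
  public string ObjectType { get; set; }

  // Unique id of the logical parent; empty only for the "model" element
  public Guid? ParentMetaId { get; set; }

  // Custom data as key/value pairs
  public string Data { get; set; }

  // Format of the Data column (1.0.0.0 corresponds to CSV)
  public string DataVersion { get; set; }
}
```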

For each schema element in the raw model, the importer does the following:

  1. Looks up the data associated with that SchemaIdentifier and ObjectType (e.g. “punchclock__person” and “class”)
  2. Updates the “Identifier”
  3. Sets the “MetaId”
  4. Loads the key/value pairs from the Data field and applies that data to the element
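
In rough pseudo-C#, the enhancement step looks something like this (all names are invented for illustration; this is not Quino’s actual API):

```csharp
// Illustrative sketch only; the model, element and helper names are invented.
foreach (var element in rawModel.SchemaElements)
{
  // 1. Look up the metadata row for this schema element
  var row = metadataRows.FirstOrDefault(
    r => r.SchemaIdentifier == element.SchemaIdentifier
      && r.ObjectType == element.ObjectType);

  if (row == null) { continue; } // no Quino metadata for this element

  // 2./3. Take over the logical identifier and the unique id
  element.Identifier = row.Identifier;
  element.MetaId = row.MetaId;

  // 4. Apply the custom key/value pairs stored in the Data column
  foreach (var pair in ParseKeyValuePairs(row.Data, row.DataVersion))
  {
    element.SetCustomData(pair.Key, pair.Value);
  }
}
```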

Generating a mapping

At this point, the imported model is ready and we can create a mapping between it and the application model. The imported model is called the source model while the application model is called the target model because we’re migrating the “source” to match the “target”.

We generate a mapping by iterating the target model:

  1. Find the corresponding schema element in the source model using MetaIds[1]
  2. If an element can be found, create a mapping for those two elements
  3. If no element can be found, create a mapping with only the target element. This will cause the element to be created in the database.
  4. For all elements in the source model that have no corresponding element in the target model, create a mapping with only the source element. This will cause the element to be dropped from the database.
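
As a sketch in C# (with invented names; not the actual Quino implementation), the mapping pass looks roughly like this:

```csharp
// Illustrative sketch of mapping generation; all names are invented.
var mappings = new List<Mapping>();
var matchedSources = new HashSet<object>();

foreach (var target in targetModel.SchemaElements)
{
  // 1. Find the corresponding source element by MetaId (see footnote [1])
  var source = sourceModel.FindByMetaId(target.MetaId);

  if (source != null)
  {
    mappings.Add(Mapping.For(source, target));    // 2. matched pair
    matchedSources.Add(source);
  }
  else
  {
    mappings.Add(Mapping.ForTargetOnly(target));  // 3. will be created
  }
}

// 4. Source elements with no counterpart in the target model will be dropped
foreach (var source in sourceModel.SchemaElements)
{
  if (!matchedSources.Contains(source))
  {
    mappings.Add(Mapping.ForSourceOnly(source));
  }
}
```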

Creating a migration plan

The important decisions have already been made in the mapping phase. At this point, the migrator just generates a migration plan, which is a list of differences that must be addressed in order to update the database to match the target model.

  • If the mapping has a source and a target element:
    • Create a difference if the element has been renamed
    • Create a difference if the element has been altered (e.g. a property has a different type or is now nullable; an index has new properties or is no longer unique; etc.)
  • If the mapping has only a source, generate a difference that the element is unneeded and should be dropped.
  • If the mapping has only a target, generate a difference that the element is missing and should be created.
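
A sketch of the plan-building step in the same style (again, the names are invented):

```csharp
// Illustrative sketch of plan generation from the mappings; names are invented.
var differences = new List<Difference>();

foreach (var mapping in mappings)
{
  if (mapping.HasSource && mapping.HasTarget)
  {
    if (mapping.Source.Identifier != mapping.Target.Identifier)
    {
      differences.Add(Difference.Renamed(mapping));
    }

    if (!mapping.Source.StructurallyEquals(mapping.Target))
    {
      differences.Add(Difference.Altered(mapping)); // type, nullability, unique, ...
    }
  }
  else if (mapping.HasSource)
  {
    differences.Add(Difference.Unneeded(mapping)); // drop from the database
  }
  else
  {
    differences.Add(Difference.Missing(mapping));  // create in the database
  }
}
```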

This is the plan that is shown to the user by the various migration tools available with Quino.[2]

The advanced console-migrator commands

At this point, we can now understand what the advanced console-migrator commands mean. Significant mappings are those mappings which correspond to a difference in the database (create, drop, rename or alter).

  • Show significant mappings: shows more detail about the names on each side of each significant mapping
  • Show significant mappings with unique ids: same as above, but also include the MetaIds for each side. Use this to debug when you suspect that you might have copy/pasted a MetaId incorrectly or inadvertently moved one.
  • Show all mappings: same detail level as “Show significant mappings”, but with all mappings, including those that are 100% matches
  • Show all mappings with unique ids: same as above, but with MetaIds

As already stated, the advanced options are really there to help a developer see why the migrator might be suggesting a change that doesn’t correspond to expectations.

Generating commands for the plan

At this point, the migrator displays the list of differences that will be addressed by the migrator if the user chooses to proceed.

What happens when the user proceeds? The migrator generates database-specific commands that, when executed against the database, will modify the schema of the database.[3]

Commands are executed for different phases of the migration process. The phases are occasionally extended but currently comprise the following.

  • Initialize: perform any required initialization before doing anything to the schema
  • DropConstraintsAndIndexes: drop all affected constraints and indexes that would otherwise prevent the desired modification of the elements involved in the migration.
  • AddUpdateOrRenameSchema: Create new tables, columns and indexes and perform any necessary renaming. The changes in this phase are non-destructive
  • UpdateData: Perform any necessary data updates before any schema elements are removed. This is usually the phase in which custom application code is executed, to copy existing data from other tables and fields before they are dropped in the next phase. For example, if there is a new required 1–1 relation, the custom code might analyze the other data in the rows of that table to determine which value that row should have for the new foreign key.
  • DropSchema: Drop any unneeded schema elements and data
  • CreatePrimaryKeys: Create primary keys required by the schema. This includes both new primary keys as well as reestablishing primary keys that were temporarily dropped in the second phase.
  • CreateConstraintsAndIndexes: Create constraints and indexes required by the schema. This includes both new constraints and indexes as well as reestablishing constraints and indexes that were temporarily dropped in the second phase.
  • UpdateMetadata: Update the Quino-specific metadata table for the affected elements.
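
Summarized as an enumeration (the phase names are taken from the list above; the actual Quino type may look different):

```csharp
// Migration-command phases in execution order; illustrative only.
public enum MigrationPhase
{
  Initialize,
  DropConstraintsAndIndexes,
  AddUpdateOrRenameSchema,
  UpdateData,
  DropSchema,
  CreatePrimaryKeys,
  CreateConstraintsAndIndexes,
  UpdateMetadata
}
```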

Executing the migration plan

The commands are then executed and the results logged.

Afterward, the schema is imported again, to verify that there are no differences between the target model and the database. In some (increasingly rare) cases, there will still be differences, in which case you can execute the new migration plan to repair those differences as well.

In development, this works remarkably well and often requires no further intervention.

Fixing failed migrations

In some cases, there is data in the database that, while compatible with the current database schema, is incompatible with the updated schema. This usually happens when a new property or constraint is introduced. For example, a new required property is added that does not have a default value or a new unique index is added which existing data violates.

In these cases, there are two things that can be done:

  • Either the database data is cleaned up in a way that makes it compatible with the target schema[4]
  • Or the developer must add custom logic to the metadata elements involved. This usually means that the developer must set a default value on a property. In rarer cases, the developer must attach logic to the affected metadata (e.g. the property or index that is causing the issue) that runs during schema migration to create new data or copy it from elsewhere in order to ensure that constraints are satisfied when they are reestablished at the end of the migration.

In general, it’s strongly advised to perform a migration against a replica of the true target database (e.g. a production database) in order to guarantee that all potential data situations have been anticipated with custom code, if necessary.

Quino Migration versus EF Migrations

It’s important to point out that Quino’s schema migration is considerably different from that employed by EF (which it picked up from Active Record migrations in Ruby, often used with Ruby on Rails). In those systems, the developer generates specific migrations to move from one model version to another. There is a clear notion of upgrading versus downgrading. Quino only recognizes migrating from one arbitrary model to another arbitrary model. This makes Quino’s migration exceedingly friendly when moving between development branches, unlike EF, whose deficiencies in this area have been documented.


[1] The default is to use only MetaIds. There is a mode in which identifiers are used as a fallback but it is used only for tools that import schemas that were not generated by Quino. Again, if the Quino metadata table hasn’t been damaged, this strict form of mapping will work extremely well.
[2] The Winform and Web user interfaces for Quino both include built-in feedback for interacting with the schema migration. There are also two standalone tools to migrate database schemas: a Winform application and a Windows console application.
[3] The form of these commands is currently a mix of SQL and custom C# code. A future feature of the migration will be to have all commands available as SQL text so that the commands, instead of being executed directly, could be saved as a file and reviewed and executed by DBAs instead of letting the tool do it. We’re not quite there yet, but proceeding nicely.
[4] This is generally what a developer does with his or her local database. The data contained therein can usually be more or less re-generated. If there is a conflict during migration, a developer can determine whether custom code is necessary or can sometimes determine that the data situation that causes the problem isn’t something that comes up in production anyway and just remove the offending elements or data until the schema migration succeeds.

EF Migrations troubleshooting

Published by marco in Programming

The version of EF Migrations discussed in this article is 5.0.20627. The version of Quino is less relevant: the features discussed have been supported for years. For those in a hurry, there is a tl;dr near the end of the article.

We use Microsoft Entity Framework (EF) Migrations in one of our projects where we are unable to use Quino. We were initially happy to be able to automate database-schema changes. After using it for a while, we have decidedly mixed feelings.

As developers of our own schema migration for the Quino ORM, we’re always on the lookout for new and better ideas to improve our own product. If we can’t use Quino, we try to optimize our development process in each project to cause as little pain as possible.

EF Migrations and branches

We ran into problems in integrating EF Migrations into a development process that uses feature branches. As long as a developer stays on a given branch, there are no problems and EF functions relatively smoothly.[1]

However, if a developer switches to a different branch—with different migrations—EF Migrations is decidedly less helpful. It is, in fact, quite cryptic and blocks progress until you figure out what’s going on.

Assume the following not-uncommon situation:

  • The project is created in the master branch
  • The project has an initial migration BASE
  • Developers A and B migrate their databases to BASE
  • Developer A starts branch feature/A and includes migration A in her database
  • Developer B starts branch feature/B and includes migration B in his database

We now have the situation in which two branches have different code and each has its own database schema. Switching from one branch to another with Git quickly and easily addresses the code differences. The database is, unfortunately, a different story.

Let’s assume that developer A switches to branch feature/B to continue working there. The natural thing for A to do is to call “update-database” from the Package Manager Console[2]. This yields the following message—all-too-familiar to EF Migrations developers.

 EF Migrations pending-changes warning

“Unable to update database to match the current model because there are pending changes and automatic migration is disabled. Either write the pending changes to a code-based migration or enable automatic migration. […]”

This situation happens regularly when working with multiple branches. It’s even possible to screw up a commit within a single branch, as illustrated in the following real-world example.

  • Add two fields to an existing class
  • Generate a migration with code that adds two fields
  • Migrate the database
  • Realize that you don’t need one of the two fields
  • Remove the C# code from the migration for that field
  • Tests run green
  • Commit everything and push it

As far as you’re concerned, you committed a single field to the model. When your co-worker runs that migration, it will be applied, but EF Migrations immediately thereafter complains that there are pending model changes to make. How can that be?

Out-of-sync migrations != outdated database

Let’s stay focused: we’re actually trying to get real work done, not debug EF Migrations. We want to answer the following questions:

  1. Why is EF Migrations having a problem updating the schema?
  2. How do I quickly and reliably update my database to use the current schema if EF Migrations refuses to do it?

The underlying reason why EF Migrations has problems is that it does not actually know what the schema of the database is. It doesn’t read the schema from the database itself, but relies instead on a copy of the EF model that it stored in the database when it last performed a successful migration.

That copy of the model is also stored in the resource file generated for the migration. EF Migrations does this so that the migration includes information about which changes it needs to apply and about the model to which the change can be applied.

If the model stored in the database does not match the model stored with the migration that you’re trying to apply, EF Migrations will not update the database. This is probably for the best, but leads us to the second question above: what do we have to do to get the database updated?

Generate a migration for those “pending changes”

The answer has already been hinted at above: we need to fix the model stored in the database for the last migration.

Let’s take a look at the situation above in which your colleague downloaded what you thought was a clean commit.

From the Package Manager Console, run add-migration foo to scaffold a migration for the so-called “pending changes” that EF Migrations detected. That’s interesting: EF Migrations thinks that your colleague should generate a migration to drop the column that you’d only temporarily added but never checked in.

That is, the column isn’t in his database, it’s not in your database, but EF Migrations is convinced that it was once in the model and must be dropped.
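
The scaffolded migration looks something like the following (the table and column names are invented for the example):

```csharp
using System.Data.Entity.Migrations;

// Hypothetical scaffolding produced by "add-migration foo": EF Migrations still
// believes the temporarily added column is part of the model and wants to drop it.
public partial class foo : DbMigration
{
    public override void Up()
    {
        DropColumn("dbo.People", "TemporaryField");
    }

    public override void Down()
    {
        AddColumn("dbo.People", "TemporaryField", c => c.String());
    }
}
```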

How does EF Migrations even know about a column that you added to your own database but that you removed from the code before committing? What dark magic is this?

The answer is probably obvious: you did check in the change. The part that you can easily remove (the C# code) is only half of the migration. As mentioned above, the other part is a binary chunk stored in the resource file associated with each migration. These BLOBs are stored in the __MigrationHistory table in the database.

DB Migration History table: migration data is stored as binary data

How to fix this problem and get back to work

Here’s the tl;dr: generate a “fake” migration, remove all of the C# code that would apply changes to the database (shown below) and execute update-database from the Package Manager Console.

 An empty migration
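
In code, the “fake” migration is just a scaffolded migration whose Up() and Down() bodies have been emptied out, for example (the migration name is arbitrary):

```csharp
using System.Data.Entity.Migrations;

// The generated C# is removed so that applying this migration changes no schema;
// it only refreshes the model snapshot stored with the last applied migration.
public partial class FixModelSnapshot : DbMigration
{
    public override void Up()
    {
        // intentionally empty
    }

    public override void Down()
    {
        // intentionally empty
    }
}
```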

This may look like it does exactly nothing. What actually happens is that it includes the current state of the EF model in the binary data for the last migration applied to the database (because you just applied it).

Once you’ve applied the migration, delete the files and remove them from the project. This migration was only generated to fix your local database; do not commit it.

Everything’s cool now, right?

Applying the fix above doesn’t mean that you won’t get database errors. If your database schema does not actually match the application model, EF will crash when it assumes fields or tables are available which do not exist in your database.

Sometimes, the only way to really clean up a damaged database—especially if you don’t have the code for the migrations that were applied there[3]—is to remove the misapplied migrations from your database, undo all of the changes to the schema (manually, of course) and then generate a new migration that starts from a known good schema.

Conclusions and comparison to Quino

The obvious answer to the complaint “it hurts when I do this” is “stop doing that”. We would dearly love to avoid these EF Migrations-related issues but developing without any schema-migration support is even more unthinkable.

We’d have to create upgrade scripts manually or maintain scripts to generate a working development database, and that for each branch. When branches are merged, the database-upgrade scripts would have to be merged and tested as well. This would be a significant addition to our development process, would introduce maintainability and quality issues and would probably slow us down even more.

And we’re certainly not going to stop developing with branches, either.

We were hoping to avoid all of this pain by using EF Migrations. That EF Migrations makes us think of going back to manual schema migration is proof that it’s not nearly as elegant a solution as our own Quino schema migration, which never gave us these problems.

Quino actually reads the schema in the database and compares that model directly against the current application model. The schema migrator generates a custom list of differences that map from the current schema to the desired schema and applies them. User intervention is possible but hardly ever required. This is an absolute godsend during development, where we can freely switch between branches without any hassle.[4]

Quino doesn’t recognize “upgrade” versus “downgrade” but instead applies “changes”. This paradigm has proven to be a much better fit for our agile, multi-branch style of development and lets us focus on our actual work rather than fighting with tools and libraries.


[1] EF Migrations as we use it is tightly bound to SQL Server. Just as one example, the inability of SQL Server to resolve cyclic cascade dependencies is in no way shielded by EF Migrations. Though the drawback originates in SQL Server, EF Migrations simply propagates it to the developer, even though it purports to provide an abstraction layer. Quino, on the other hand, does the heavy lifting of managing triggers to circumvent this limitation.
[2] As an aside, this is a spectacularly misleading name for a program feature. It should just be called “Console”.
[3] I haven’t ever been able to use the Down method that is generated with each migration, but perhaps someone with more experience could explain how to properly apply such a thing. If that doesn’t work, the method outlined above is your only fallback.
[4] The aforementioned database-script maintenance or having only very discrete schema-update points or maintaining a database per branch and switching with configuration files or using database backups or any other schemes that end up distracting you from working.


When you really start learning

Published by marco in Quotes

“The paradox of education is precisely this—that as one begins to become conscious, one begins to examine the society in which he is being educated.”

From “A Talk to Teachers” by James Baldwin

iTunes: another tale of woe in UX

Published by marco in Technology

I know that pointing out errors in iTunes is a bit passé but Apple keeps releasing new versions of this thing without addressing the fundamental problems that it has as a synchronization client.

The software has to synchronize with hardware from only one manufacturer—the same one that makes iTunes. I’ll leave off complaints about the horrific, very old and utterly non-scaling UI and just regale you with a tale of a recent interaction in which I restored my phone from a backup. In that sense, it’s a “user experience”.

In this tale, we will see that two of the main features of the synchronization part of the iTunes software—backup and sync—seem to be utterly misinterpreted.

Spoiler alert: it all works out in the end, but it’s mind-boggling that this is the state of Apple’s main software after almost 15 years.[1]

10 million new iPhones were sold over the weekend. Their owners will all have the pleasure of working with this software.

Restore from backup

Me: *attaches phone*
iTunes: Restore from backup?
Me: Sure!
iTunes: *shows almost full iPhone* There you go!
Me: Thanks! That was fast!
Me: Wait…my phone is empty (no apps, no music, no contacts)
iTunes: *blushes* Yeah, about that…
Me: *reconnects phone*
iTunes: *shows nearly empty iPhone* What’s the problem?
Me: Seriously, RESTORE FROM BACKUP (select EXACT SAME backup as before)
iTunes: On it! Sir, yes sir!
Me: OK. Apps are back; contacts are back. No music, iTunes? What part of the word “backup” is causing difficulties here?
iTunes: *blushes* (again) Ummm, dunno what happened there
Me: Fine. It was randomly selected anyway.
Me: Select random music from this playlist
iTunes: Here ya go!
Me: Sync
iTunes: Nothing to do
Me: Sync
iTunes: Seriously, dude, there’s nothing to do
Me: SYNC
iTunes: Done
Me: No music on phone. Do you understand the word “sync” differently as well? You know, like how you have trouble with the word “backup”?
iTunes:
Me: *notices that size of playlist exceeds capacity of iPhone*
Me: that’s 17GB of music. For a 16GB iPhone.
iTunes: Yep! Awesome, right?
Me: Is that why you won’t sync?
iTunes: Error messages are gauche. I don’t use them. Everything is intuitive.
Me: Fine. Reserve space when selecting music: 1GB (don’t need more extra space than that)
iTunes: NP! Here’s 15GB of music.
Me: Wait, what? You’re supposed to leave 1GB empty *of the available space* not the *total size of the device*
iTunes: Math is hard. … You do it.
Me: Fine. Reserve 4.2GB?
iTunes: Done.
Me: Now I have a 28GB playlist.
iTunes: *pats self on back*
Me: Reserve 3.2GB … and “delete all existing” and “replace”? Now does it work?
iTunes: 9GB for you
Me: *tweaks settings 2 or 3 more times*
iTunes: 10.5GB
Me: Perfect. That was totally easy.
Me: Sync
iTunes: On it! *hums to self*
Me: Why are you only syncing 850 songs when the playlist has 1700 of them?
iTunes: *continues humming*
Me: Fine. *wanders away*
iTunes: Done
Me: Sync
iTunes: *syncing 250 more songs*
Me: What the hell?
iTunes: Done.
Me: Sync
iTunes: *syncs remaining songs*
Me: This is ridiculous
iTunes: Done


[1] It has been pointed out to me that I am using this software in a somewhat archaic way: to wit, I am not allowing iTunes to synchronize all of my data to the cloud first. Had I done that, it is claimed, I would have had fewer problems. I am, however, skeptical. I think that a company that can’t even get local sync working properly after 15 years has no business getting any of my data.

An introduction to PowerShell

Published by marco in Programming

On Wednesday, August 27th, Tymon gave the rest of Encodo[1] a great introduction to PowerShell. I’ve attached the presentation but a lot of the content was in demonstrations on the command-line.

  1. Download the presentation
  2. Unzip to a local folder
  3. Open index.html in a modern web browser (Chrome/Opera/Firefox work the best; IE has some rendering issues)

We learned a few very interesting things:

  • PowerShell is pre-installed on every modern Windows computer
  • You can use PowerShell to connect to other machines (almost like ssh!)
  • Windows developers should definitely learn how to use PowerShell.
  • Unix administrators who have to work on Windows machines should definitely learn how to use PowerShell. The underlying functionality of the operating system is much more discoverable via the command line, get-command and get-member than via the GUI.
  • You should definitely install ConEmu
  • When running ConEmu, make sure that you start a PowerShell session rather than the default Cmd session.
  • If you’re writing scripts, you should definitely install and use the ISE, which is an IDE for PowerShell scripts with debugging, code-completion, lists of available commands and much better copy/paste than the standard console.
  • The PowerShell Language Reference v3 is a very useful and compact reference for beginners and even for more advanced users

ConEmu Setup

The easiest way to integrate PowerShell into your workflow is to make it eminently accessible by installing ConEmu. ConEmu is a Windows command-line with a tabbed interface and offers a tremendous number of power-user settings and features. You can tweak it to your heart’s content.

ConEmu in Quake mode

I set mine up to look like the one that Tymon had in the demonstrations (shown on my desktop to the right).

  1. Download ConEmu (CodePlex); I installed version 140814, the most recent version marked as “beta”. There is no official release yet, but the software is quite mature.
  2. Install it and run it. I didn’t allow the Win + Num support because I know that I’d never use it. YMMV and you can always change your choice from the preferences.
  3. Show the settings to customize your installation. There are a ton of settings, so I listed the ones I changed below.
  4.  Set the window size to something a bit larger than the standard settings, especially if you have a larger monitor. I use 120 × 40.
  5.  Choose the color scheme you want to use. I’m using the standard PowerShell colors but a lot of popular, darker schemes are also available (e.g. Monokai).
  6.  Check out the hotkeys and set them up accordingly. The only key I plan on using is the one to show ConEmu. On the Swiss-German keyboard, it’s Ctrl + ¨.
  7.  The default console is not transparent, but there are those of us who enjoy a bit of transparency. Again, YMMV. I turned it on and left the slider at the default setting.
  8.  And, finally, you can turn on Quake-style console mode to make it drop down from the top of your primary monitor instead of appearing in a free-floating window.


[1] and one former Encodo employee—hey Stephan!