Since the 1980s, IBM mainframes have come with a built-in hypervisor that allows you to partition the machine into multiple systems. (As I recall, Amdahl had this feature first.) For over a decade, we’ve run three partitions on the administrative computing system: one partition where we install new versions of the operating system when we receive them from IBM and make sure they work as delivered; a second where we make our local modifications and install third-party applications to test them out; and finally, the production partition that everyone except the systems staff uses. This means that all developer and quality assurance testing occurs in the production partition.
I’ve mentioned this to a few people, but now I’d like to publicly float the idea of breaking up the production partition and spinning out the test and quality assurance environments into their own partitions. The primary advantage of doing so would be a decreased possibility of testing efforts inadvertently affecting production data. (This would particularly apply in batch, where currently developers have to come up with different names for test data sets.) A second advantage would be an enhanced ability to make sure non-production work doesn’t receive an excessive share of the mainframe’s capacity.
The main disadvantage would be slightly more complicated procedures for migrating code and data between environments. Also, setting this up would require some effort, and developers would have to change some of their habits.
So what do y’all think?
It is uncomfortably easy to mess up production data inadvertently, at least in batch. On the other hand, most developers here have already learned how to avoid that (in some cases, it should be admitted, by doing it once and regretting it so much that they have been careful ever since). So the ongoing cost of the current arrangement is not too big.
So if splitting the partition increases the ongoing cost of migrations between environments (something we do quite often) by much at all, it would probably not be a good tradeoff. By “cost” here I mean developer time, not money, of course.
On the other hand, if the ongoing cost increase of migrations were negligible, and it’s just the one-time cost of everyone learning a new method, it might be worth it, especially if it helped in balancing prod vs. non-prod load during peak times.
I’m worried that this will turn out to break a lot of utility code. I floated the question in a recent FIS meeting and got a generally negative response — mostly fears that it will break things, and doubts about how compelling the advantages are.
Part of this is that we are now operating in a climate of great uncertainty — what will this move to “open systems” mean in practice? If we are moving away from the mainframe, no one wants to spend a bunch of resources making it better.