Storage Space

Jan 31, 2010 15:04

I've been trying to come up with a better way to persist MV3D state info. Basically, I need a solution that meets the following:

  • Synchronous when saving data
  • Transactional
  • Extremely fast when saving data
  • Maintains data consistency
  • Supports queries
  • Externally available (i.e. you can view/modify/query the data outside of MV3D)
  • Low CPU overhead

    The current solution does support most of those. The four it handles worst are queries, data consistency, CPU overhead, and external availability. It's also just kind of wacky. You define a list of properties in your class that should be stored, along with which of those properties should get a search index. When it's time to save an object, it generates a UUID for it, stuffs all the properties you defined into a special object, pickles it, and then uses Axiom to persist it to sqlite. Index objects stored in Axiom match the UUID with the value of the indexed property. Querying is also a little odd, since you use magical properties on a query object (q.a == 12). Unfortunately, the object type you are querying for may not have an "a" attribute at all, and the query object doesn't really care. Another interesting part of the system is that it requires all servers to store their data locally. I'm not sure how I feel about this, since some MV3D data really isn't useful to other servers, at least as the design stands now. However, it does require that there be two servers for every item stored in order to recover from a catastrophic failure of a single server.

    I've tried in the past to go with a completely Axiom-based solution, but that didn't work well because of the restrictions Axiom puts on any classes you mark up as Items that can be persisted. One thing I'd been thinking about was a dual-object approach where you have the in-memory object, and when it needs to persist, it stores all its persistable attributes into a specific Axiom class. This system could even use powerUps. This had a couple of downsides: you'd have to define twice as many classes, and you'd end up defining your data structure twice as well. The other option is to have a utility to generate that extra code, but I'm not a big fan of code generation, so I'd like to avoid that.

    What I'm currently thinking of is adding a new service type to MV3D that would generally act like a data access layer (DAL). You'd mark up classes in Python with which attributes should be stored, and I'm thinking of combining this with the attributes that are sent over the network. In order to persist an object that had this markup, you'd hand it over to this service. It'd then store the marked-up attributes and return a unique ID you can use to retrieve the object later. Technically, at this point it really doesn't matter where or how the data is stored, but I'm going to go into that a bit anyway. There would be several low-level functions: register schema, upgrade schema, add object, update object, get object, and query. Basically, a schema in this sense is the markup of a particular class, which defines the attributes to store and what type they are. For SQL-based stores, this will be directly associated with a table. Schemas will be versioned so that when a newer version is registered, all data is upgraded. I see this happening by renaming the existing table, creating a new one with the new schema, and running code on each element to upgrade it.
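
    To make the upgrade step concrete, here is a minimal sketch of the rename/create/migrate idea against sqlite. The function name upgrade_schema, the dict-of-columns schema format, and the upgrade_row callback are all made up for illustration, and dropping the old table once the migration succeeds is an extra assumption on top of the plan above.

```python
import sqlite3

def upgrade_schema(conn, table, new_columns, upgrade_row):
    """Migrate `table` to a new schema version.

    new_columns: dict mapping column name -> SQL type (hypothetical format).
    upgrade_row: function taking an old row as a dict and returning a dict
                 with a value for every key in new_columns.
    Table and column names are interpolated directly, so they must come
    from trusted class markup, never from user input.
    """
    old = table + "_old"
    conn.execute("BEGIN")  # run the whole migration in one transaction
    try:
        conn.execute(f"ALTER TABLE {table} RENAME TO {old}")
        col_defs = ", ".join(f"{n} {t}" for n, t in new_columns.items())
        conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, {col_defs})")

        cur = conn.execute(f"SELECT * FROM {old}")
        names = [d[0] for d in cur.description]
        col_list = ", ".join(["id"] + list(new_columns))
        marks = ", ".join("?" for _ in range(len(new_columns) + 1))
        for row in cur.fetchall():
            old_row = dict(zip(names, row))
            new_row = upgrade_row(old_row)
            # Keep the primary key so existing object IDs stay valid
            values = [old_row["id"]] + [new_row[n] for n in new_columns]
            conn.execute(f"INSERT INTO {table} ({col_list}) VALUES ({marks})", values)

        # Assumption: the old table is discarded once every row is migrated
        conn.execute(f"DROP TABLE {old}")
        conn.commit()
    except Exception:
        conn.rollback()  # leave the original table untouched on any failure
        raise
```

    Keeping the primary key across the migration matters here, since the ID handed back to callers is what they use to find the object again.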

    After the schema is created, adding an object would just be a SQL insert, and updating would be similar. Both of those could return the primary key of the row as an ID. Combining the primary key with the object's class and the service's location would give you a way of retrieving that object from anywhere. Querying could make use of the class markup objects to let the user do something like: store.query(Person, Person.name == "mike").
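
    Here is a rough sketch of how that query syntax could work. Attr, Condition, and Store are hypothetical names, and only equality is supported; the point is just that the markup object overloads == at the class level to build a WHERE clause.

```python
import sqlite3

class Condition:
    """A single 'column op value' comparison, ready to become a WHERE clause."""
    def __init__(self, column, op, value):
        self.column, self.op, self.value = column, op, value

class Attr:
    """Hypothetical class markup: declares one stored attribute and its SQL type."""
    def __init__(self, sqltype):
        self.sqltype = sqltype
        self.name = None
    def __set_name__(self, owner, name):
        self.name = name  # filled in automatically when the class is created
    def __eq__(self, value):
        # Person.name == "mike" builds a Condition instead of returning a bool
        return Condition(self.name, "=", value)

def stored_names(cls):
    return [n for n, a in vars(cls).items() if isinstance(a, Attr)]

class Store:
    def __init__(self, conn):
        self.conn = conn
    def register(self, cls):
        cols = ", ".join(f"{n} {vars(cls)[n].sqltype}" for n in stored_names(cls))
        self.conn.execute(f"CREATE TABLE {cls.__name__} (id INTEGER PRIMARY KEY, {cols})")
    def add(self, obj):
        names = stored_names(type(obj))
        marks = ", ".join("?" for _ in names)
        cur = self.conn.execute(
            f"INSERT INTO {type(obj).__name__} ({', '.join(names)}) VALUES ({marks})",
            [getattr(obj, n) for n in names])
        self.conn.commit()
        return cur.lastrowid  # primary key doubles as the object's ID
    def query(self, cls, cond):
        names = stored_names(cls)
        sql = f"SELECT {', '.join(names)} FROM {cls.__name__} WHERE {cond.column} {cond.op} ?"
        results = []
        for row in self.conn.execute(sql, (cond.value,)):
            obj = cls.__new__(cls)  # skip __init__; just restore stored state
            for n, v in zip(names, row):
                setattr(obj, n, v)
            results.append(obj)
        return results

class Person:
    name = Attr("TEXT")
    level = Attr("INTEGER")
    def __init__(self, name, level):
        self.name = name    # instance values shadow the class-level markup
        self.level = level
```

    One nice side effect compared to the magical query object: Person.a raises an AttributeError right away if Person never declared an "a" attribute.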

    Looking at the requirements I mentioned, the major ones that it fails are being synchronous and maintaining data consistency. I put synchronous when saving up there because I like to be able to wrap methods that modify data in an @autoStore decorator which stores the object after the method runs. If this were async, either that function would have to return a deferred, or you'd lose data consistency if there was a failure. It would also be sweet to be able to define the markup classes as descriptors which would a) store the object and b) notify any network clients of the changes. A possible solution to this would be to strategically limit the remotely available functionality of the service to the low-level operations I mentioned previously, and to make those operations have nothing to do with converting objects into persistable data. This would require a local object/service to interact with the remote store. The local code could build a queue of transactions to send to the master store, and this queue could be persisted via sqlite. This way, the data would be synchronously stored to disk locally and then sent off to the remote store whenever.
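
    As a sketch of the local half of this, assuming a made-up LocalQueue class and a JSON payload format: the decorator writes synchronously to a local sqlite queue, and a separate flush step ships records to the remote store later. The send callback stands in for whatever the remote call ends up being.

```python
import functools
import json
import sqlite3

class LocalQueue:
    """Hypothetical local queue of pending writes, persisted in sqlite."""
    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS pending "
            "(seq INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)")

    def enqueue(self, record):
        # Synchronous: the record is committed locally before we return
        with self.conn:
            self.conn.execute(
                "INSERT INTO pending (payload) VALUES (?)", (json.dumps(record),))

    def flush(self, send):
        """Send queued records in order; stop at the first one that fails."""
        rows = self.conn.execute(
            "SELECT seq, payload FROM pending ORDER BY seq").fetchall()
        sent = 0
        for seq, payload in rows:
            if not send(json.loads(payload)):
                break  # keep this record and everything after it queued
            with self.conn:
                self.conn.execute("DELETE FROM pending WHERE seq = ?", (seq,))
            sent += 1
        return sent

def autoStore(method):
    """Queue the object's stored attributes after the wrapped method runs."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        result = method(self, *args, **kwargs)
        attrs = {name: getattr(self, name) for name in self.stored}
        self.queue.enqueue({"id": self.store_id, "attrs": attrs})
        return result
    return wrapper

class Character:
    stored = ("name", "hp")  # hypothetical markup: attributes to persist
    def __init__(self, queue, store_id, name, hp):
        self.queue, self.store_id = queue, store_id
        self.name, self.hp = name, hp

    @autoStore
    def damage(self, amount):
        self.hp -= amount
```

    The caller of damage() never sees a deferred; the only thing it waits on is the local disk write.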

    One other issue here is that theoretically multiple servers could access the same object in the store at the same time. Yes, that sounds like a feature, but the aforementioned queue causes some issues with doing this. The data in the master store may not be 100% up to date. What might be interesting is to create a locking mechanism whereby when loading an object from the store, you acquire a lock on it. The hard part will be making sure that the lock goes away whenever you stop using the object and that failure conditions (such as your server crashing while holding locks on multiple objects) are properly handled.
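
    One way to handle the crashed-server case is to give every lock a lease that expires unless it's renewed. That's an assumption on my part rather than a settled design, and LeaseLocks and locked are hypothetical names, but a minimal in-memory sketch looks like this:

```python
import time
from contextlib import contextmanager

class LeaseLocks:
    """Hypothetical lock table where each lock expires after a lease period."""
    def __init__(self, lease_seconds=30.0, clock=time.monotonic):
        self.lease_seconds = lease_seconds
        self.clock = clock
        self._locks = {}  # object id -> (owner, expires_at)

    def acquire(self, obj_id, owner):
        """Take or renew the lock; returns False if someone else holds it."""
        now = self.clock()
        holder = self._locks.get(obj_id)
        if holder is not None and holder[0] != owner and holder[1] > now:
            return False
        # Free, expired, or already ours: (re)start the lease
        self._locks[obj_id] = (owner, now + self.lease_seconds)
        return True

    def release(self, obj_id, owner):
        holder = self._locks.get(obj_id)
        if holder is not None and holder[0] == owner:
            del self._locks[obj_id]

@contextmanager
def locked(locks, obj_id, owner):
    """Hold the lock only for the duration of a with-block."""
    if not locks.acquire(obj_id, owner):
        raise RuntimeError(f"object {obj_id} is locked by another server")
    try:
        yield
    finally:
        locks.release(obj_id, owner)
```

    A server that crashes simply stops renewing, so its locks free themselves once the lease runs out. The catch is that a slow-but-alive server could lose its lease without noticing, so the store would still need to check lock ownership on every write.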

    Another issue is what to do if the store rejects a change that seemed like it would succeed locally. If we go with writes that return immediately and don't wait for success, it would already be too late to tell the original caller that something went wrong. What are some of the reasons this would happen?

  • Remote storage server is down. Ok, just keep it in the queue until it's back up.
  • Remote storage server is out of space. Keeping it in the queue is probably best.
  • Schema mismatch or invalid data. This seems like the worst failure mode.

    What do we do with a schema mismatch or invalid data? This indicates a fairly serious problem, so maybe it deserves a serious resolution-- revert the object and all other objects related to the transaction that failed both on the remote store and in memory. That still doesn't seem like a great way to resolve the issue, but it would maintain data consistency.
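
    A sketch of the in-memory half of that, with hypothetical exception and class names: transient failures leave the transaction queued, while a rejection restores every object in the transaction from a snapshot taken before the change. Reverting on the remote store, and dealing with later transactions that built on the reverted state, are not covered here.

```python
import copy

class StoreUnavailable(Exception):
    """Transient failure: server down or out of space. Retry later."""

class StoreRejected(Exception):
    """Permanent failure: schema mismatch or invalid data."""

class Transaction:
    def __init__(self, objects):
        self.objects = list(objects)
        # Snapshot each object's attributes before anything is modified
        self.snapshots = [copy.deepcopy(vars(o)) for o in self.objects]

    def revert(self):
        """Restore every object to its state at snapshot time."""
        for obj, snap in zip(self.objects, self.snapshots):
            vars(obj).clear()
            vars(obj).update(copy.deepcopy(snap))

def drain(queue, send):
    """Send queued transactions in order; returns the ones that were reverted.

    queue: list of Transaction objects, oldest first (modified in place).
    send:  callable that raises StoreUnavailable or StoreRejected on failure.
    """
    reverted = []
    while queue:
        txn = queue[0]
        try:
            send(txn)
        except StoreUnavailable:
            break  # leave this and everything behind it queued
        except StoreRejected:
            txn.revert()
            reverted.append(txn)
        queue.pop(0)  # sent or reverted: either way it leaves the queue
    return reverted
```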

    All in all, this seems like it'd be pretty challenging to implement, and it'd mean that I'd be maintaining a DAL/ORM as opposed to using a pre-existing one. I'm generally against re-inventing the wheel like that, but all the Python ORMs I know about are designed for webapps and don't translate well to MMOGs.

    Really at this point, I'm looking for someone to talk me out of this and tell me why it's a horrible idea. Otherwise, I might just be crazy enough to try it out. One thing that's bugging me is that the current mechanism works. I haven't had any lost data issues or anything; however, I've found that sometimes it's just easier to blow away the whole store than to, say, revert a change I made.

    At the very least, I like how the new system abstracts out the DAL to a service that can optionally be on a remote server. This makes it so the underlying persistence technology can be pretty much anything. Another major benefit would be the ability to freeze objects by storing them and stopping simulation of them. Then you could unfreeze them later on a different server. What's this good for? Well, for one, making it so your character leaves the world when you log out. This isn't currently possible without a load of hacks.
  • mv3d, sql, mmorpgs, scalability, storage, orms
