Celtic Heroes

The Official Forum for Celtic Heroes, the 3D MMORPG for iOS and Android Devices

About the auction house and the performance issues

#1
I think everyone can imagine why the new Under the Hammer update brings the servers to their knees.
Basically, the Under the Hammer update is a big performance-sucking machine. The server has to deal with all the information coming in every second.
For example:
Item x is up for auction
Who is the current owner of item x
How much he/she wants for item x
How much time is left to bid on item x
Who is bidding on item x
How high is the current bid on item x
The server needs to inform other bidders when they are outbid, and it needs to return their gold (a rough sketch of that step follows this list)...
And so on and so forth
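
For what it's worth, here is a rough Python sketch of that outbid step, just to make the bookkeeping concrete. None of this reflects OTM's actual code; the Auction class and the mail_gold/notify helpers are invented stand-ins.

```python
# Illustrative only: an in-memory model of the bookkeeping the server does
# whenever a new bid arrives. Every name here is made up for the example.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Auction:
    item: str
    seller: str
    asking_price: int                  # minimum the seller will accept
    top_bidder: Optional[str] = None
    top_bid: int = 0

def mail_gold(player: str, amount: int, reason: str) -> None:
    # Stand-in for the in-game mail that returns gold to an outbid player.
    print(f"mail to {player}: {amount} gold returned ({reason})")

def notify(player: str, message: str) -> None:
    # Stand-in for whatever notification channel the game uses.
    print(f"notify {player}: {message}")

def place_bid(auction: Auction, bidder: str, amount: int) -> bool:
    """Accept a bid if it beats the current one; refund and notify the old bidder."""
    lowest_acceptable = auction.top_bid + 1 if auction.top_bidder else auction.asking_price
    if amount < lowest_acceptable:
        return False                   # bid too low, reject it
    previous_bidder, previous_bid = auction.top_bidder, auction.top_bid
    auction.top_bidder, auction.top_bid = bidder, amount
    if previous_bidder is not None:
        mail_gold(previous_bidder, previous_bid, f"outbid on {auction.item}")
        notify(previous_bidder, f"You were outbid on {auction.item} ({amount} gold).")
    return True

# Tiny demo of a two-person bidding war over item x.
auction = Auction(item="item x", seller="SellerA", asking_price=1000)
place_bid(auction, "PlayerOne", 1000)
place_bid(auction, "PlayerTwo", 1500)  # PlayerOne gets gold back and a notice
```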

With all of that going on, it's no wonder (in my opinion)
that the server at some point can't handle it all and goes down.

That brings us to the question:
How do we fix it?

I myself think there are only two, maybe three, ways to solve this problem.

1. The problem is the Under the Hammer update itself.
So one possibility is simply to remove the auction house (which I don't really want, because it is a sweet addition to the game and makes it easier to get fashion, lixes, etc.).

2. It might be possible to move the auction house to a separate server.
For example, the Fingal server would be connected to a Fingal auction house server. To enter the auction house server you would just travel to a, let's call it, portal, which moves you automatically from the Fingal main server to the Fingal auction house server and back. (Although I can understand that this approach would be a bit difficult to realise.)

That's my opinion. Sorry for my bad grammar; I hope everyone can understand what I mean.


PS: Please correct me if I am wrong.

Re: About the auction house and the performance issues

#2
Oh man! Trying to debug code issues without reading the code or having a deep understanding of the system architecture sounds like fun.
Here is my stab at it:
Item x is up for auction
Who is the current owner of item x
How much he/she wants for item x
How much time is left to bid on item x
Who is bidding on item x
How high is the current bid on item x
The server needs to inform other bidders when they are outbid, and it needs to return their gold...
And so on and so forth
We can store all of this information, other than the current bid, in a single database table.
The current bid is likely its own table, which records who is bidding on the item and what the current bid is.
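
A minimal sketch of that two-table layout, using SQLite in Python purely as a stand-in for whatever relational database OTM actually uses; every table and column name here is an assumption on my part.

```python
# Sketch of the two-table layout described above (SQLite as a stand-in).
# All table/column names are invented; OTM's real schema is unknown.
import sqlite3

conn = sqlite3.connect(":memory:")       # throwaway demo database
conn.executescript("""
CREATE TABLE auctions (
    auction_id   INTEGER PRIMARY KEY,
    item_id      INTEGER NOT NULL,       -- which item x is up for auction
    seller_id    INTEGER NOT NULL,       -- current owner of item x
    asking_price INTEGER NOT NULL,       -- how much the seller wants
    ends_at      INTEGER NOT NULL        -- unix time when bidding closes
);

CREATE TABLE bids (
    auction_id   INTEGER NOT NULL REFERENCES auctions(auction_id),
    bidder_id    INTEGER NOT NULL,       -- who is bidding on item x
    amount       INTEGER NOT NULL,       -- how high the current bid is
    placed_at    INTEGER NOT NULL
);
""")
conn.commit()
```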

If you put the right indexes in place, this should be pretty fast, especially if you clean up expired rows with a cron job. I suspect the AH is already on its own API server, because the AH lookups (likely cached) are super fast. I honestly don't think the AH is the bottleneck at all.
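
And the indexes plus the cron-style cleanup could be as small as the sketch below (same invented schema as above; the CREATE TABLE IF NOT EXISTS lines only repeat it so the script runs on its own, and the database path is hypothetical).

```python
# Sketch of a cron-style cleanup job: sensible indexes plus a purge of
# expired auctions. Same invented schema as the sketch above.
import sqlite3
import time

DB_PATH = "auction_house.db"             # hypothetical path, for illustration

conn = sqlite3.connect(DB_PATH)
conn.executescript("""
CREATE TABLE IF NOT EXISTS auctions (
    auction_id INTEGER PRIMARY KEY, item_id INTEGER, seller_id INTEGER,
    asking_price INTEGER, ends_at INTEGER);
CREATE TABLE IF NOT EXISTS bids (
    auction_id INTEGER, bidder_id INTEGER, amount INTEGER, placed_at INTEGER);

-- Indexes that keep the common AH queries fast:
CREATE INDEX IF NOT EXISTS idx_auctions_ends_at ON auctions(ends_at);
CREATE INDEX IF NOT EXISTS idx_auctions_item    ON auctions(item_id);
CREATE INDEX IF NOT EXISTS idx_bids_auction     ON bids(auction_id);
""")

now = int(time.time())
# Drop bids first (they reference auctions), then the expired auctions.
conn.execute(
    "DELETE FROM bids WHERE auction_id IN "
    "(SELECT auction_id FROM auctions WHERE ends_at < ?)", (now,))
conn.execute("DELETE FROM auctions WHERE ends_at < ?", (now,))
conn.commit()
conn.close()
```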

I think the load is actually coming from the email system. Every time someone loses or wins an auction, people run over and check their email to collect the item or gold.
I was in 3 bidding wars this weekend and was constantly checking my email. The latency of the email system was 3000-5000 ms, and requests often failed entirely. I would sometimes need to open a mail 5-10 times to get my gold back so I could continue my bidding wars. That is insane latency for an API endpoint; the SLA should be sub-500 ms.
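
Those latency numbers were eyeballed from the client, but the sketch below shows how one could sanity-check an endpoint against a sub-500 ms SLA. The URL is a placeholder; I have no idea what the real mail endpoint is, or whether it is even plain HTTP.

```python
# Rough latency check for an HTTP endpoint against a 500 ms target.
# The URL is hypothetical; substitute whatever endpoint you can reach.
import time
import urllib.request

SLA_MS = 500
URL = "https://example.com/mail/inbox"   # placeholder, not the real API

samples = []
for _ in range(5):
    start = time.perf_counter()
    try:
        urllib.request.urlopen(URL, timeout=10).read()
        samples.append((time.perf_counter() - start) * 1000.0)
    except Exception as exc:             # count hard failures separately
        print(f"request failed: {exc}")

for ms in samples:
    status = "OK" if ms <= SLA_MS else "SLA MISS"
    print(f"{ms:7.1f} ms  {status}")
```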

The problem here is that the email server likely was never designed to handle people checking their email 100x as often. If it is a relational DB, they need to make sure they have an index on the user_id column; they probably already have this. If that is the case, one way to speed things up is by deleting old emails. Hopefully OTM clears deleted emails from the DB with a cron job, but I suspect they do not. All of this is based on the assumption that they use a relational database for their persistent storage.

OTM just needs to improve the scalability of the email server. My guess is that the email server runs on the same box as the login server, which is what is causing the problems with the initial login. Once you were logged in, gameplay was fine for everything other than checking email. They just need to upgrade their AWS instance for the login server.

I suspect the database is the bottleneck. If that's the case, they should check out RDS from AWS. It's pretty awesome: it can be upgraded to a larger instance in seconds and scales horizontally beautifully. If the bottleneck is the login server itself, they might consider Elastic Beanstalk, if they can get away with running the binary without any standalone processes. EC2 can be set up to scale like Beanstalk does if they do require standalone processes on the login server API instances.
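
Still assuming a relational store, the mail-side maintenance could look something like this sketch: an index on user_id plus a batched purge of mail the player has already deleted. The schema, column names, and file path are all invented.

```python
# Sketch of mail-table maintenance: index the lookup column and purge
# mail the player has already deleted, in small batches to keep locks short.
# Schema and names are invented; SQLite stands in for the real database.
import sqlite3

conn = sqlite3.connect("game_mail.db")   # hypothetical path
conn.executescript("""
CREATE TABLE IF NOT EXISTS mail (
    mail_id     INTEGER PRIMARY KEY,
    user_id     INTEGER NOT NULL,        -- recipient; every inbox query filters on this
    gold        INTEGER DEFAULT 0,
    item_id     INTEGER,
    is_deleted  INTEGER DEFAULT 0,       -- set when the player deletes the mail
    created_at  INTEGER NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_mail_user ON mail(user_id);
""")

# Purge deleted mail in batches so a big backlog never holds a long lock.
while True:
    cur = conn.execute(
        "DELETE FROM mail WHERE mail_id IN "
        "(SELECT mail_id FROM mail WHERE is_deleted = 1 LIMIT 1000)")
    conn.commit()
    if cur.rowcount == 0:
        break
conn.close()
```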

Scaling a service is difficult. It's completely reasonable for the devs to assume that they could rely on the previously stable email server to handle AH transactions; learning that it would push the email/login server to dangerous load levels during peak traffic is definitely a gotcha. Hopefully they have monitoring in place to catch this sort of thing. I hope they don't roll back the AH.
215+ Druid
220+ Mage
220+ Warrior
220+ Rogue
215+ Ranger
