– /u/Cantsa1 Disclaimer: I have worked in a data center for over 5 years and I am a BICSI Data Center Best Practices certified service tech, so I do have knowledge of how data centers work. I will also try to be as non-technical as I can, and sorry for the long post.
First off, after reading all the posts, the displeasure with SE atm about the errors that have occurred is completely understandable, and I sympathize with all of you who are affected by this issue. Second, I will give some insight into how data centers work to try to explain why these errors are popping up.
This is my speculation on where the capacity of SE’s NA/EU data centers stands atm, and I am giving SE the benefit of the doubt here because I hope they did this.
First off, I believe that this open beta is a stress test, as stated. I have been in a few myself, and these are usually the parameters of a stress test. The test is to try to establish normal. What this means is that between the OE (operational equipment) and the NOE (nonoperational equipment, meant to act as redundancy), limits are set to what the administrators feel would be normal traffic going through the servers. They set benchmarks and limit the system to see if the servers can run at this ideal level without considerable hiccups. At most, from what I have seen, this is usually around 20-40% of the servers' capacity. During this process the NOE is ready to come online in case the system does not meet this requirement.
Case in point: SE was able to turn on 2 new worlds during the beta because the demand to play was so high and the problems with getting access were frustrating. After the testing phase the administrators know whether the load balance is too low or too high, and it can be adjusted by adding support to the existing servers and adding redundancy to the system as well, which includes adding more servers, electrical, cooling, etc. I expect SE’s data center must be at least in the range of 6000 – 10000 square feet to accomplish this correctly.
Also, if I am correct, SE doesn’t have each world on one server. There are several servers working and communicating together to minimize the load on each piece of equipment. This means that there are probably 4-7 servers per area for each world. Ex. Ul’dah has several servers dedicated just to that area, for all worlds. This would make sense, so that each area can handle the number of avatars in it.
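To make the "establish normal" idea a bit more concrete, here is a minimal sketch in Python of how a load check against a 40% ceiling might look. Everything in it is an assumption for illustration: the ServerPool class, the standby_needed function, and the per-server capacity number are made up, and nothing here reflects how SE actually runs its servers.

```python
from dataclasses import dataclass


@dataclass
class ServerPool:
    """Hypothetical pool of servers for one world (illustration only)."""
    active: int         # OE: servers currently taking player traffic
    standby: int        # NOE: redundant servers held in reserve
    capacity_each: int  # assumed max concurrent players per server

    def utilization(self, players: int) -> float:
        """Fraction of active capacity used by the given player count."""
        return players / (self.active * self.capacity_each)


def standby_needed(pool: ServerPool, players: int, target_pct: int = 40) -> int:
    """How many standby servers to bring online so that utilization
    stays at or below target_pct, limited by how many standby exist."""
    per_server_budget = pool.capacity_each * target_pct  # in player-percent units
    # Ceiling division in integers to avoid floating point rounding.
    required = (players * 100 + per_server_budget - 1) // per_server_budget
    extra = max(0, required - pool.active)
    return min(extra, pool.standby)


if __name__ == "__main__":
    pool = ServerPool(active=4, standby=4, capacity_each=1000)
    for players in (1200, 2400, 5000):
        print(players, round(pool.utilization(players), 2),
              standby_needed(pool, players))
```

The last case in the loop is the interesting one: when demand is far above what was planned for, even bringing every standby server online is not enough, which is roughly the situation described here.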
Here is another example of what I mean:
Client servers – FFXIV client
Zone servers – Each area has dedicated servers for all worlds
Character servers – Save character information and logs
Log servers – Error logs for bugs and issues
Firewall servers – Protection servers
Dungeon servers – Self explanatory
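As a rough sketch of the layout above, here is how the "several zone servers per area, shared by all worlds" idea could be modeled in Python. This is only my guess expressed as code: the server names, the world name, and the pick_zone_server function are hypothetical, and the 4 servers for Ul'dah is just the low end of the 4-7 range mentioned earlier.

```python
import zlib

# Server roles from the list above (descriptions paraphrased).
SERVER_ROLES = {
    "client": "FFXIV client connections",
    "zone": "dedicated servers per area, shared by all worlds",
    "character": "character information and logs",
    "log": "error logs for bugs and issues",
    "firewall": "protection",
    "dungeon": "instanced dungeons",
}

# Hypothetical zone server names; the real names and counts are unknown.
ZONE_SERVERS = {
    "Ul'dah": ["uldah-1", "uldah-2", "uldah-3", "uldah-4"],
}


def pick_zone_server(zone: str, world: str, character_id: int) -> str:
    """Spread characters across a zone's servers with a stable hash,
    so the same character always lands on the same server."""
    servers = ZONE_SERVERS[zone]
    key = f"{world}:{character_id}".encode("utf-8")
    return servers[zlib.crc32(key) % len(servers)]


if __name__ == "__main__":
    # "ExampleWorld" is a placeholder, not a real world name.
    print(pick_zone_server("Ul'dah", "ExampleWorld", 12345))
```

A hash like this is only one of many ways to split load. The point is just that no single machine has to hold every avatar standing in one busy area.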
Now with this explained, I believe that during this beta they set several restrictions to try to establish a normal processing output. What caught them off guard was the amount of response they received for this beta, which is both good and bad. What they thought would be normal was thrown out the window, causing a lot of the errors because their parameters were set too low. They are also trying to scan the forums for bugs to address. This takes manpower away from fixing issues, and it takes time for reports to build up from both the forums and the system logs.
Currently they are looking through the log servers to see how the players received these errors. Some people have been able to log back on, some have not. Here is the related article.
So in conclusion, I believe that the real issue is that SE set its expectations too low, based its minimal requirements on previous data from v1.0 and the earlier beta phases, and didn’t expect the outpouring of support that this game has gotten in recent months since the Reboot. The reason, I believe, why SE didn’t just open up more space is that this gives them the perfect scenario to deal with these issues. Although it was unexpected and a lot of people are pissed because of it, it allowed them to better implement a solution before early access and launch. I also believe that they only used about half of the servers available for the game in this beta. Meaning, and this is just speculation, half of the servers dedicated to each world are not being used and are not designated as NOE. Turning them on during these issues would not have given them viable data, but it would have allowed more people to play on particular worlds with their friends and in general. Since this is such a short beta period, it isn’t wise for them to fire up the remaining servers while trying to establish a normal operating process.
Now like I said, I am giving SE the benefit of the doubt here. I just hope they’re taking the opportunity to learn from this and realize it’s better to overestimate normal than to underestimate it. Here is hoping also that EA and Launch run smoothly.
EDIT:
I have read a lot of your posts and I want to say something in retrospect. This is all speculation on what SE is doing; in no way does it mean that they are actually doing something similar. I can just understand it from this particular standpoint, having been in similar situations. If these issues progress past EA and launch and linger, then they have a deeper problem on their hands. Here is hoping I am right in my analysis of how they’re handling these issues.
Vice Black Leader replied
619 weeks ago