Optimizing MySQL server settings


Octocat **Promotion** - Efficiently manage your coding bookmarks, aka #codingmarks, on www.codingmarks.org and share your hidden gems with the world. They will be published weekly on Github. Please help us build THE programming-resources location - Star


Podcastpedia.org uses the MySQL database, in version 5.5.32 (# mysql -V),  to store metadata (title, description, update frequency, url of the image, urls of episodes etc.) of podcasts from the directory. The database contains both MyISAM (for performance and full-text indexing capabilities) and InnoDB tables. An upgrade to version 6 and only to InnoDB tables is considered, once it’s mature enough and GoDaddy will support it. This post presents how MySQL server is configured, and to be more specific how the my.cnf file looks like.

The MySQL Configuration File: my.cnf

You can tweak MySQL server settings by undertaking a proper configuration of MySQL’s configuration file. This file, entitled my.cnf, stores default startup options for both the server and for clients. On Unix systems the configuration file is located at /etc/my.cnf (red hat), but could also be found at /etc/mysql/my.cnf (debian systems). There are several ways you can build this file

  • before MySQL 5.6.8, MySQL distributions include several sample option files that can be used as a basis for tuning the MySQL server. Look for files named my-small.cnf, my-medium.cnf, my-large.cnf, and my-huge.cnf, which are sample files for small, medium, large, and very large systems. On Windows, the extension is .ini rather than .cnf.
    • use a Configuration Wizard/Generator for MySQL – the one from Percona is pretty good
    • look through the manuals and build the file by yourself from scratch

    After doing some research I decided to build the file from scratch, of course with some inspiration from the other two methods. This way I could also learn a few things about the database server. There are MANY MANY options that can be configured on the MySQL database, but I came to the conclusion that it’s better to configure the basic ones correctly(most properties are fine at their defaults) and focus more on proper indexing, query design and query optimization. So here it’s how the my.cnf file looks like

    [mysqld]
    ft_min_word_len = 3
    #GENERAL
    user = mysql
    port = xxxx
    default_storage_engine = InnoDB
    socket = /var/lib/mysql/mysql.sock
    pid_file = /var/lib/mysql/mysql.pid
    
    # DATA STORAGE #
    datadir = /var/lib/mysql
    #INNODB
    innodb_buffer_pool_size = 378M
    innodb_log_file_size = 64M
    innodb_file_per_table=1
    innodb_flush_method = O_DIRECT
    #MyISAM
    key_buffer_size = 64M
    #Logging
    log_error = /var/lib/mysql/mysql-error.log
    slow_query_log = 1
    slow_query_log_file = /var/lib/mysql/mysql-slow.log
    #OTHER
    tmp_table_size = 32M
    max_heap_table_size = 32M
    query_cache_type = 0
    query_cache_size = 00
    max_connections = 500
    thread_cache_size = 50
    table_open_cache = 800
    open_files_limit = 65535
    [client]
    socket = /var/lib/mysql/mysql.sock
    port = xxxx

    The explanation for a few parameters:

    • ft_min_word_len = 3 – the minimum length of words to be indexed in the full-text search index. If you modify this value you need to rebuild the FULLTEXT indexes.
    • user = mysqlmysqld will run as the mysql user account on the operating system
    • port = xxxx – port on which MySQL listens to (default is 3306)
    • default_storage_engine = InnoDB – default storage engine starting with MySQL 5.5. See here the benefits of using InnoDB
    • socket = /var/lib/mysql/mysql.sock – The mysql.sock is the socket that mysqld creates for programs to connect to.
    • pid_file = /var/lib/mysql/mysql.pid – A process id (pid) file is where the process id (pid) assigned by the server is recorded, when a particular instance of MySQL starts
    • datadir = /var/lib/mysql – location of data, popular location on many Unix variants
  • innodb_buffer_pool_size = 378M – the most important option for Innodb Performance. InnoDB maintains a storage area called the buffer pool for caching data and indexes in memory. The default value (128M) is a little bit small for my needs. Ideally, you set the size of the buffer pool to as large a value as practical, leaving enough memory for other processes on the server to run without excessive paging. The larger the buffer pool, the more InnoDB acts like an in-memory database, reading data from disk once and then accessing the data from memory during subsequent reads. The buffer pool even caches data changed by insert and update operations, so that disk writes can be grouped together for better performance.  See also choosing innodb buffer pool size
  • innodb_log_file_size = 64M  – The size in bytes of each log file in a log group. The combined size of log files (innodb_log_file_size * innodb_log_files_in_group) cannot exceed a maximum value that is slightly less than 512GB. A pair of 255 GB log files, for example, would allow you to approach the limit but not exceed it. The default value is 48MB. Sensible values range from 1MB to 1/N-th of the size of the buffer pool, where N is the number of log files in the group. The larger the value, the less checkpoint flush activity is needed in the buffer pool, saving disk I/O. Larger log files also make crash recovery slower, although improvements to recovery performance in MySQL 5.5 and higher make the log file size less of a consideration.
  • innodb_file_per_table=1 – If innodb_file_per_table is disabled (the default), InnoDB creates tables in the shared tablespace. If innodb_file_per_table is enabled (=1), InnoDB creates each new table using its own .ibd file for storing data and indexes, rather than in the shared tablespace.Tablespace also never reduces in size, the same as if you had this disabled, but you have now the possibility to rebuild the table. Follow Reasons to use innodb_file_per_table for more information.
    • innodb_flush_method = O_DIRECT – when set to O_DIRECT the operating system’s caching wil be bypassed. This assures that there is no overhead of double buffering, since innodb_buffer_pool_size was also decently set
    • #MyISAM
  • key_buffer_size = 128M – Index blocks for MyISAM tables are buffered and are shared by all threads. key_buffer_size is the size of the buffer used for index blocks. The key buffer is also known as the key cache. The key_buffer_size is probably the most useful single variable to tweak on the MyISAM engine. Be aware that MyISAM itself caches only indexes, not data. So if possible the value of this setting should cover the size of all your indexes.
    • #LOGGING
    • log_error = /var/lib/mysql/mysql-error.log – Log errors and startup messages to this file. See also “The Error Log”.
    • slow_query_log = 1 – whether the slow query log is enabled. The value can be 0 (or OFF) to disable the log or 1 (or ON) to enable the log.
    • slow_query_log_file = /var/lib/mysql/mysql-slow.log – the name/destination of the slow query log file. The default value is <em class="replaceable"><code>host_name</em>-slow.log</code>, but the initial value can be changed with the --slow_query_log_file option.
  • tmp_table_size = 32M – the maximum size of internal in-memory temporary tables. (The actual limit is determined as the minimum of tmp_table_size and max_heap_table_size.) If an in-memory temporary table exceeds the limit, MySQL automatically converts it to an on-disk MyISAM table. Increase the value of tmp_table_size (and max_heap_table_size if necessary) if you do many advanced GROUP BY queries and you have lots of memory. This variable does not apply to user-created MEMORY tables.You can compare the number of internal on-disk temporary tables created to the total number of internal temporary tables created by comparing the values of the Created_tmp_disk_tables and Created_tmp_tables variables.
  • max_heap_table_size = 32M – this variable sets the maximum size to which user-created MEMORY tables are permitted to grow. The value of the variable is used to calculate MEMORY table MAX_ROWS values. Setting this variable has no effect on any existing MEMORY table, unless the table is re-created with a statement such as CREATE TABLE or altered with ALTER TABLE or TRUNCATE TABLE. A server restart also sets the maximum size of existing MEMORY tables to the global max_heap_table_size value. This variable is also used in conjunction with tmp_table_size to limit the size of internal in-memory tables.
    • query_cache_type = 0 – set the query cache type. A value of 0 or OFF means do not cache results in or retrieve results from the query cache. Note that this does not deallocate the query cache buffer. To do that, you should set query_cache_size to 0.
  • query_cache_size = 00 – the amount of memory allocated for caching query results. By default, the query cache is disabled. I prefer doing caching at the application level with EhCache – this will be explained in a future post
    • max_connections = 377 – the maximum permitted number of simultaneous client connections. If you get a Too many connections error when you try to connect to the mysqld server, this means that all available connections are in use by other clients.
  • thread_cache_size = 54 – how many threads the server should cache for reuse. When a client disconnects, the client’s threads are put in the cache if there are fewer than thread_cache_size threads there. Requests for threads are satisfied by reusing threads taken from the cache if possible, and only when the cache is empty is a new thread created. This variable can be increased to improve performance if you have a lot of new connections. Normally, this does not provide a notable performance improvement if you have a good thread implementation. However, if your server sees hundreds of connections per second you should normally set thread_cache_size high enough so that most new connections use cached threads. By examining the difference between the Connections (SHOW GLOBAL STATUS LIKE 'Connections';) and Threads_created (SHOW GLOBAL STATUS LIKE 'Threads_created';) status variables, you can see how efficient the thread cache is.
  • table_open_cache = 600 – The number of open tables for all threads. Increasing this value increases the number of file descriptors that mysqld requires. You can check whether you need to increase the table cache by checking the Opened_tables (SHOW GLOBAL STATUS LIKE 'open_tables';) status variable.
  • To verify if the variable has been set correctly run ” show variable like 'VARIABLE_NAME' ” (e.g. show variables like 'max_connections') in the MySQL console.

    The calculation was based on the assumptions that MySQL has 756MB of RAM available, a database size around 250MB and maximum 300 users trying to access the system (the underlying machine is Cent OS 6.4 virtual private server with 2GB of RAM and 8 cores processor). The same assumptions were made trying to configure the Tomcat JDBC Connection Pool – see my post Tomcat JDBC Connection Pool configuration for production and development on that. Of course if resource demand grows (fingers crossed 🙂 ), I would have to adjust.

    Post disclaimer
    I am by no means a DB/MySql Expert, I come from a development background, so please take the configuration as is. You should treat your application/system optimization, as a holistic approach including OS, DB, web server, application server, web caching, 2nd level caching, etc. I posted this here to easily remember what everything means in the configuration file, why I set these values, to learn something and hopefully to get your feedback if you see any room improvement.

    If you really want an expert’s opinion, I highly recommend Baron Schwartz’ book – High Perfomance MySQL:

    You can find in it advanced stuff around MySQL performance like query performance optimization, indexing for high performance, optimizing schema and data types, OS and hardware optimization and so much more.

    If you liked this, please show your support by following us and joining us on Podcastpedia.org

    We promise to only share high quality podcasts and episodes.

    References

    1. MySQL 5.5 reference manual
    2. MyISAM Storage Engine
    3. InnoDB Storage Engine
    4. MySQL Performance Blog
    Podcastpedia image

    Adrian Matei

    Creator of Podcastpedia.org and Codingpedia.org, computer science engineer, husband, father, curious and passionate about science, computers, software, education, economics, social equity, philosophy - but these are just outside labels and not that important, deep inside we are all just consciousness, right?

    Parallel calls with async-await in javascript - I promise you all performance and simplicity

    I was blown away about the simplicity and performance gain of making parallel calls with the new async-await feature in javascript. See the blog post to understand why. Continue reading