Graphite is one of the first-line troubleshooting tools for Shokunin's clients, and most of them run it on Amazon EC2. Through trial and error, we have established a few best practices for setting it up in the cloud.
Python has a global interpreter lock (GIL), which means each data collection (cache) process is limited to a single core. Run more than one cache process so that no single core is maxed out, and have each cache process write to a separate EBS volume.
################################################
# Puppet Controlled
# /opt/graphite/conf/carbon.conf
################################################
# Don't write to the default port keep it to catch misconfigured
# clients and fix them later
[cache]
MAX_CACHE_SIZE = inf
LINE_RECEIVER_INTERFACE = 0.0.0.0
LINE_RECEIVER_PORT = 2003
ENABLE_UDP_LISTENER = False
UDP_RECEIVER_INTERFACE = 0.0.0.0
UDP_RECEIVER_PORT = 2003
PICKLE_RECEIVER_INTERFACE = 0.0.0.0
PICKLE_RECEIVER_PORT = 2004
USE_INSECURE_UNPICKLER = False
CACHE_QUERY_INTERFACE = 0.0.0.0
CACHE_QUERY_PORT = 7002
USE_FLOW_CONTROL = True
LOG_UPDATES = False
LOG_CACHE_HITS = False
WHISPER_AUTOFLUSH = False
###############################################
#
[cache:01]
STORAGE_DIR = /graphite_data/01
LOCAL_DATA_DIR = /graphite_data/01
MAX_CACHE_SIZE = inf
MAX_UPDATES_PER_SECOND = 1000
MAX_CREATES_PER_MINUTE = 50
LINE_RECEIVER_INTERFACE = 0.0.0.0
LINE_RECEIVER_PORT = 2013
ENABLE_UDP_LISTENER = False
UDP_RECEIVER_INTERFACE = 0.0.0.0
UDP_RECEIVER_PORT = 2013
PICKLE_RECEIVER_INTERFACE = 0.0.0.0
PICKLE_RECEIVER_PORT = 2014
USE_INSECURE_UNPICKLER = False
CACHE_QUERY_INTERFACE = 0.0.0.0
CACHE_QUERY_PORT = 7012
USE_FLOW_CONTROL = True
LOG_UPDATES = False
LOG_CACHE_HITS = False
WHISPER_AUTOFLUSH = False
###############################################
[cache:02]
STORAGE_DIR = /graphite_data/02
LOCAL_DATA_DIR = /graphite_data/02
MAX_CACHE_SIZE = inf
MAX_UPDATES_PER_SECOND = 1000
MAX_CREATES_PER_MINUTE = 50
LINE_RECEIVER_INTERFACE = 0.0.0.0
LINE_RECEIVER_PORT = 2023
ENABLE_UDP_LISTENER = False
UDP_RECEIVER_INTERFACE = 0.0.0.0
UDP_RECEIVER_PORT = 2023
PICKLE_RECEIVER_INTERFACE = 0.0.0.0
PICKLE_RECEIVER_PORT = 2024
USE_INSECURE_UNPICKLER = False
CACHE_QUERY_INTERFACE = 0.0.0.0
CACHE_QUERY_PORT = 7022
USE_FLOW_CONTROL = True
LOG_UPDATES = False
LOG_CACHE_HITS = False
WHISPER_AUTOFLUSH = False
###############################################
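With several cache processes, a client has to send a given metric to the same instance every time, otherwise one series ends up split across two storage directories. A minimal sketch of that idea, assuming the two line receiver ports from the config above (2013 and 2023). The helper names are hypothetical, and the CRC-based choice is only an illustration of a stable mapping, not the algorithm any Graphite component uses:

```python
import socket
import time
import zlib

# Line receiver ports of [cache:01] and [cache:02] in carbon.conf above.
CACHE_PORTS = [2013, 2023]


def format_line(path, value, timestamp):
    """Build one line of Carbon's plaintext protocol: 'path value timestamp'."""
    return "%s %s %d\n" % (path, value, int(timestamp))


def pick_port(path, ports=CACHE_PORTS):
    """Map a metric path to one port; the same path always gets the same port."""
    return ports[zlib.crc32(path.encode("utf-8")) % len(ports)]


def send_metric(host, path, value, timestamp=None):
    """Send a single metric over TCP to the cache instance chosen for its path."""
    if timestamp is None:
        timestamp = time.time()
    line = format_line(path, value, timestamp)
    with socket.create_connection((host, pick_port(path)), timeout=5) as sock:
        sock.sendall(line.encode("utf-8"))
```

A call would look like send_metric("graphite.example.com", "infra.web.web01.load", 0.42), where the host name is a placeholder for your own Graphite server.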
You will need to modify local_settings.py to make it aware of the new storage locations, by adding the following:
#/opt/graphite/webapp/graphite/local_settings.py
STANDARD_DIRS = ['/graphite_data/01', '/graphite_data/02', '/graphite_data/03', '/graphite_data/04']
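Since the point of this layout is one EBS volume per cache process, it can help to check that the storage directories really sit on different devices. A rough sketch, assuming each directory is the mount point of its own volume; distinct_devices is a hypothetical helper and not part of Graphite:

```python
import os

# Same four paths as STANDARD_DIRS in local_settings.py above.
STANDARD_DIRS = ['/graphite_data/%02d' % n for n in range(1, 5)]


def distinct_devices(paths, stat=os.stat):
    """Return True if every path lives on a different device (st_dev)."""
    devices = [stat(p).st_dev for p in paths]
    return len(set(devices)) == len(devices)


if __name__ == "__main__":
    if all(os.path.isdir(d) for d in STANDARD_DIRS):
        print("separate volumes:", distinct_devices(STANDARD_DIRS))
    else:
        print("some storage directories are missing")
```

Two directories on the same volume report the same st_dev, so the check returns False when a mount is missing and the data would land on the root disk instead.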
While there are other collectors, we prefer collectd because it is lightweight, compiled C, has plugins for all major infrastructure components (Apache, Nginx, MySQL, Redis, Java JMX), and makes it simple to write other plugins. Example plugin and config:
###############################################################
# Puppet Controlled Default Template
###############################################################
FQDNLookup false
LoadPlugin syslog
<Plugin syslog>
  LogLevel info
</Plugin>
LoadPlugin cpu
LoadPlugin disk
LoadPlugin interface
LoadPlugin memory
LoadPlugin network
LoadPlugin swap
LoadPlugin vmem
LoadPlugin write_graphite
<Plugin "write_graphite">
  <Carbon>
    Host "<%= graphite_server %>"
    Port "<%= graphite_port %>"
    Prefix "infra.<%= server_role %>."
    EscapeCharacter "_"
    StoreRates true
    AlwaysAppendDS false
  </Carbon>
</Plugin>
###################################
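With these settings, the metric names that arrive in Graphite can be predicted from the collectd identifiers. A sketch of the naming scheme as we understand it from the write_graphite options above: the prefix, then the host name with dots escaped, then plugin and type, each optionally followed by "-instance". graphite_path is a hypothetical helper that only approximates the plugin's behavior, and the role and host name in the example are made up:

```python
def graphite_path(prefix, host, plugin, plugin_instance, type_, type_instance,
                  escape_char="_"):
    """Approximate the metric path write_graphite emits for one value."""
    def escape(part):
        # EscapeCharacter replaces dots so they do not create extra tree levels.
        return part.replace(".", escape_char)

    def with_instance(name, instance):
        # An instance, when present, is appended to the name with a dash.
        name = escape(name)
        return "%s-%s" % (name, escape(instance)) if instance else name

    return "%s%s.%s.%s" % (
        prefix,
        escape(host),
        with_instance(plugin, plugin_instance),
        with_instance(type_, type_instance),
    )


# Prefix "infra.<role>." with a made-up role "web" and a made-up host name.
print(graphite_path("infra.web.", "web01.example.com", "cpu", "0", "cpu", "idle"))
# infra.web.web01_example_com.cpu-0.cpu-idle
```

Because AlwaysAppendDS is false, the data source name is left off for types with a single data source, which is why the sketch stops at the type.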