Nice Hybris performance/design post.
Given that the original blog is inactive, I'm copying and structuring some of its content below:
- Data structure / DB
- Indexing: add DB indexes that match your model customizations. (High priority) Also drop unused indexes so that the DB does not waste time maintaining them. (Medium priority)
- Ensure the custom types defined in items.xml persist their data in their own tables rather than piggybacking on the generic table. (High priority)
- Any relation end that does not need to be ordered should have its ordered attribute set to false. A common mistake is cutting and pasting type and relation definitions between items.xml files, which can unintentionally mark relation ends as ordered. (High priority)
- If no deployment table is defined for a many-to-many relation, a generic join table is used to store the relation, which is not optimal for performance. So define a dedicated relation table for each custom many-to-many relation.
- Avoid defining collection types. To maintain data integrity, always use a relation.
- Front end webserver
- Leverage web server modules for browser caching and compression. Don’t forget to disable unused Apache modules. (Low priority)
- Ensure hybris app servers are not serving static content. (High priority) Use a CDN to deliver static media and pages. This may reduce the load on your infrastructure by 90%. If that is not possible, at least ensure the web server layer in front of hybris serves the static media, or scale externally.
- Perform SSL termination at the load balancer or web server so that the hybris app servers do not have to handle SSL handshakes. (Medium priority)
- Ensure only the required extensions are deployed on front-end hybris servers, and avoid deploying any back-office extension there. This is good for both security and performance. (Medium priority)
- Log/audit
- Ensure the logger configuration is not too verbose in production. (Low priority)
- Cron job logs: the hmc does not paginate them, so opening a job with thousands of log files kills your hmc. (Medium priority)
- Avoid creating history records if possible, or purge them at regular intervals. Creating a history record through the auditing service might look fancy, but it can become a big performance overhead very quickly. Imagine creating an audit record for each stock level change. (Medium priority)
- Catalogs
- Avoid maintaining or developing catalog-version-aware types. If you cannot, avoid making them part of the catalog sync. Take price as an example: touching the price of every product daily may result in a full catalog sync. Either maintain prices in the online version only, or update both versions whenever a price is created or updated. This way you can remove the price item type from the catalog sync.
- If hybris is not your PIM and your merchandising team previews products in some other system before publishing them, you really don’t need multiple catalog versions in hybris. Dropping them saves a lot of overhead. (High priority)
- Stock service
- Ensure you are not checking stock when loading the product detail page or category listing pages, or on add/remove events on the basket page. Do this check only when an item is added to the basket, when the basket page is submitted, and just before taking payment (i.e. the point where you need to reserve stock). (High priority)
- Ensure the hybris stock service is used to check stock status and that no logic is written directly against the stock model. (High priority)
- Ensure the JALO session timeout and the web app session timeout are configured to the same value. (High priority)
- Clustering
- Four load-balanced hybris app servers should be more than sufficient for a decent load, if the solution is designed correctly.
- The choice between TCP and UDP clustering doesn’t matter much for a four-server cluster, but prefer sticking with the default UDP settings. Hybris works well in either case unless you have a serious networking issue. Use udpsniff to validate packets. (Low priority)
- Set the minimum and maximum JVM heap sizes to the same value. 8 GB should be more than sufficient. More memory means longer GC pauses, and a GC pause means all threads are halted.
- Search
- Ensure the search solution is designed properly. This area needs especially careful thought.
- Run Solr in standalone mode with one master server. Run a delta index frequently, every 10-15 minutes. Prefer two-phase mode so that your site does not get stuck while indexing happens behind the scenes. Run a full index update once a week, or only when you change the schema. (High priority)
- Use Solr as much as you can, because this saves DB calls. Hybris is very chatty with the DB because of its cache refreshing and lazy loading. I really hate this part of hybris, but I have got used to living with it and have identified ways to avoid DB-centric solutions. E.g. you can use Solr to render category listing pages, and you can index price and stock data and use that data as much as you can. The whole objective is to touch the DB only when it is really necessary. (High priority)
- Disable Quick Search in the hmc on both front-end and back-end hybris application servers. (Low priority)
- Use out-of-the-box functionality whenever possible
- Order invoice generation/regeneration in PDF format.
- Purging/archiving items older than 30 days: you can configure this manually through the hmc or write an ImpEx script.
- Ensure you use the hybris WCMS navigation node design to build the mega menu, so that you do not end up building the nested category hierarchy on every hit.
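
To make the stock-check rule from the Stock service section concrete, here is a minimal plain-Java sketch of it as a lookup. Everything in it (the `StockCheckPolicy` class, the `Event` names, the two methods) is hypothetical and made up for illustration; it is not hybris API, and the real check would still go through the hybris stock service.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical helper: decides at which storefront events stock is checked. */
final class StockCheckPolicy {

    /** Illustrative event names (assumptions, not hybris types). */
    enum Event {
        PRODUCT_DETAIL_VIEW,
        CATEGORY_LISTING_VIEW,
        BASKET_PAGE_ADD_REMOVE,
        ADD_TO_BASKET,
        BASKET_SUBMISSION,
        BEFORE_PAYMENT
    }

    // The only three points at which the post recommends a stock check.
    private static final Set<Event> CHECKED = EnumSet.of(
            Event.ADD_TO_BASKET,
            Event.BASKET_SUBMISSION,
            Event.BEFORE_PAYMENT);

    private StockCheckPolicy() {
    }

    /** True if a stock check should run for this event. */
    static boolean shouldCheckStock(Event event) {
        return CHECKED.contains(event);
    }

    /** Stock is reserved only just before payment is taken. */
    static boolean shouldReserveStock(Event event) {
        return event == Event.BEFORE_PAYMENT;
    }

    public static void main(String[] args) {
        // Print the decision for every event.
        for (Event event : Event.values()) {
            System.out.println(event
                    + " check=" + shouldCheckStock(event)
                    + " reserve=" + shouldReserveStock(event));
        }
    }
}
```

Keeping the decision in one place like this means page controllers never call the stock service "just in case", which is where the unnecessary DB traffic would come from.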
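
For the heap-size advice in the Clustering section, one rough way to verify on a running node that the minimum and maximum heap really match is to ask the JVM itself. The sketch below uses only the standard `java.lang.management` API; the class name `HeapSettingsCheck` and the method `isFixedSize` are my own, and the strict equality test is an assumption on my part.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Hypothetical helper: reports whether the JVM heap looks fixed-size. */
final class HeapSettingsCheck {

    private HeapSettingsCheck() {
    }

    /**
     * True when both sizes are defined (positive) and equal.
     * The JVM reports -1 when a value is undefined.
     */
    static boolean isFixedSize(long initBytes, long maxBytes) {
        return initBytes > 0 && initBytes == maxBytes;
    }

    public static void main(String[] args) {
        // Heap usage as seen by the running JVM.
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long init = heap.getInit();
        long max = heap.getMax();

        System.out.printf("initial heap: %d MB, max heap: %d MB%n", init >> 20, max >> 20);
        System.out.println(isFixedSize(init, max)
                ? "heap looks fixed-size"
                : "initial and max heap differ; check the JVM heap flags");
    }
}
```

Depending on the garbage collector, the reported maximum can come out slightly below the configured value, so treat a mismatch as a prompt to look at the JVM flags rather than as proof of a misconfiguration.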