Scaling Zope for ASPN Andy McKay, ActiveState
# Agenda
- Specifications
- Building it
- Scaling it
- Lessons learned

# Specifications
- ActiveState Programmers Network (ASPN)
- Modeled on MSDN
- ASPN is a subscription model product that provides a bundle of software and services
- Free level: ASPN Open
- ASPN membership included with products
- Data restricted by level

# Features
- First release:
  - Mailing list archives
  - Cookbooks (Python Cookbook etc)
  - Documentation and books
  - Personalisation based on access
- Second release:
  - Modules (eg PIL, DBI etc)
  - Web Services and Passport interfaces
  - On-line chat, polls, customisable home page

# Side track: current ActiveState site
- Scaling Zope the easy way, "Baking"
- Back end built in Zope
- Ugly, ugly DTML and ZClasses
- Perl script writes pages out to IIS
- PerlEx to serve out dynamic stuff
- Works great, get to use Zope and keep it very scalable

# Current ActiveState site
- Good bits
  - Fast
  - Zope back end totally isolated
  - Chance to alter content
- Bad bits
  - Administering IIS is a real pain
  - Perl script
  - URLs not the same in IIS as in Zope due to acquisition
  - Essentially static

# Back to ASPN... Resources
- Given 3 weeks from specification to launch for most of the features
- Resources:
  - 4 developers with lots of Perl, SQL and a bit of Zope experience
  - 1 HTML / front end person
  - Average of 75% of time to spend on it

# Choosing a design (1)
- Zope was not a shoo-in; considered alternatives, eg: PHP, mod_perl, PerlEx etc
- Fastest execution, most scalable, most expensive: Apache, mod_perl and Oracle
- Easiest to develop: Apache, Zope
- Time to develop was more important
- Ability to integrate with existing site helpful

# Choosing a design (2)
- After much debating we chose:
  - Multiple Zope front ends
  - MS SQL Server back end
    - More faith in SQL than ZCatalog
    - Already been burned by ZODB and ZCatalog in Zope 2.1.6
    - ZEO not needed
  - Cisco router for load balancing

# Design

# Testing
- After some rapid development and one Python conference...
- With two days left we were able to devote some time to meaningful testing
- Unfortunately it wasn't very fast, but we had no time left to do major overhauls
- Wrote some simple testing tools using:
  - xmlrpc to automate some unit tests
  - webchat to automate testing HTML

# Traffic (approx)
- 10 million hits per month
- 500,000 visits
- 2 ms to serve a page (?)
- About 76 GB a month of traffic
- Similar size to Zope.org
- Not as much as we had hoped

# Database Size
- As of 30 Jan 2002 there are:
  - 800k e-mails from 142 mailing lists
    - growing at 10,000 a day
    - indexing takes about 1 hour on our servers (750 MHz PIII, 512 MB)
  - 40k pages of other content
  - 34k registered users

# Performance Issues
- ZODBCA is single threaded
- Too many requests coming through to Zope
  - Some images
- Some sloppy coding, eg:
  - cyclic reference (adding REQUEST to REQUEST)
  - returning large results to DTML rather than writing a good SQL statement
- Memory leaks
  - Leak finder from Shane
- SQL indexing slows down the database

# Lessons Learned (SQL)
- MS SQL is not as great as we thought
  - Slow to index content
  - More problems with SQL and the connection than anything else
- Properly used, ZEO and ZCatalog is probably better for full text
- Tight control: unfortunately people write stuff in Zope because it's easier, and then it doesn't get indexed

# Lessons Learned (Caching)
- Redesign the front end to make caching easier
  - Unfortunately lots of ASPN is designed to be dynamic
  - The most popular pages (Home, mailing lists) are dynamic
- Cache as much as possible
  - Start with the easy things: images, CSS, static content
- Caching per thread is nowhere near as useful as caching across all the threads (especially if you are running with lots of threads)

# Things we got right
- Separating development and production totally
- Trying to avoid DTML when possible (not enough)
- Writing Python products is good, anything else is bad
- Having a separate database that can be written to in many different ways
- Using a SQL back end
- Once the content switch was stable it rocked
  - Fully recommended if you can afford it; does load balancing, fail over etc.

# Improvements
- Waiting for stable Zope (2.4.4 looking good so far)
- ZmxODBCA working, plus caching SQL statements
- RAM cache manager and Accelerated cache manager
- Consolidating DTML
- Removing Site Access
- Performance result:
  - Approx 2x performance
  - Hard to measure until into production

# Questions?
- ASPN: http://aspn.ActiveState.com
- WWW: http://www.ActiveState.com
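
The caching lesson (per-thread caches are far less useful than one cache shared by all threads) can be illustrated with a small sketch. This is not Zope's cache manager or the ASPN code; `render`, `run` and the page names are hypothetical, and the example only counts how often an expensive page render happens under each strategy.

```python
import threading


def run(num_threads, pages, shared):
    """Return how many times the (hypothetical) expensive render ran."""
    renders = 0
    count_lock = threading.Lock()   # protects the render counter
    cache_lock = threading.Lock()   # protects the shared cache
    shared_cache = {}               # one cache for every thread
    local = threading.local()       # holds one cache per thread

    def render(page):
        # Stand-in for an expensive dynamic page build.
        nonlocal renders
        with count_lock:
            renders += 1
        return "<html>%s</html>" % page

    def get(page):
        if shared:
            # All threads consult the same dictionary.
            with cache_lock:
                if page not in shared_cache:
                    shared_cache[page] = render(page)
                return shared_cache[page]
        # Each thread lazily creates and consults its own dictionary.
        cache = getattr(local, "cache", None)
        if cache is None:
            cache = local.cache = {}
        if page not in cache:
            cache[page] = render(page)
        return cache[page]

    def worker():
        for page in pages:
            get(page)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return renders


if __name__ == "__main__":
    pages = ["home", "mailing-list", "cookbook"]
    print(run(8, pages, shared=False))  # 24: every thread renders every page
    print(run(8, pages, shared=True))   # 3: each page is rendered once
```

With 8 threads and 3 pages, the per-thread strategy pays for 24 renders while the shared one pays for 3, and the gap widens as more threads are added.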