<presentation filename="ASPNZope.pdf">
<stylesheet module="activestate" function="getParagraphStyles"/>
<title>Scaling Zope for ASPN</title>
<author>
Andy McKay
</author>
<section name="Title">
<fixedimage filename="background.jpg" x="0" y="0"/>
<slide title="TalkTitle" id="Slide001">
<frame x="125" y="100" width="600" height="300">
<para style="Title">
Scaling Zope for ASPN
</para>
<para style="BigCentered">
Andy McKay, ActiveState
</para>
</frame>
</slide>
</section>
<section name="Main">
<fixedimage filename="backgroundSlide.jpg" x="0" y="0"/>
<slide title="Agenda" id="Slide001">
<infostring align="right" x="800" y="36" size="14" font="Helvetica"># %(page)s</infostring>
<frame x="10" y="485" width="790" height="75">
<para style="Heading1">Agenda</para>
</frame>
<frame x="50" y="75" width="750" height="400">
<para style="Bullet1">Specifications</para>
<para style="Bullet1">Building it</para>
<para style="Bullet1">Scaling it</para>
<para style="Bullet1">Lessons learned</para>
</frame>
</slide>
<slide title="Specifications" id="Slide002">
<infostring align="right" x="800" y="36" size="14" font="Helvetica"># %(page)s</infostring>
<frame x="10" y="485" width="790" height="75">
<para style="Heading1">Specifications</para>
</frame>
<frame x="50" y="75" width="750" height="400">
<para style="Bullet1">ActiveState Programmers Network (ASPN)</para>
<para style="Bullet2">Modeled on MSDN</para>
<para style="Bullet2">ASPN is a subscription model product that provides a bundle of software and services</para>
<para style="Bullet2">Free level ASPN Open</para>
<para style="Bullet1">ASPN membership included with products</para>
<para style="Bullet1">Data restricted by level</para>
</frame>
</slide>
<slide title="Features" id="Slide003">
<infostring align="right" x="800" y="36" size="14" font="Helvetica"># %(page)s</infostring>
<frame x="10" y="485" width="790" height="75">
<para style="Heading1">Features</para>
</frame>
<frame x="50" y="75" width="750" height="400">
<para style="Bullet1">First release:</para>
<para style="Bullet2">Mailing list archives, Cookbooks (Python Cookbook etc.)</para>
<para style="Bullet2">Documentation and books</para>
<para style="Bullet2">Personalisation based on access</para>
<para style="Bullet1">Second release:</para>
<para style="Bullet2">Modules (e.g. PIL, DBI)</para>
<para style="Bullet2">Web Services and Passport interfaces</para>
<para style="Bullet2">On-line chat, polls, customisable home page</para>
</frame>
</slide>
<slide title="Side track: current ActiveState site" id="Slide004">
<infostring align="right" x="800" y="36" size="14" font="Helvetica"># %(page)s</infostring>
<frame x="10" y="485" width="790" height="75">
<para style="Heading1">Side track: current ActiveState site</para>
</frame>
<frame x="50" y="75" width="750" height="400">
<para style="Bullet1">Scaling Zope the easy way, "Baking"</para>
<para style="Bullet1">Back end built in Zope</para>
<para style="Bullet2">Ugly, ugly DTML and ZClasses</para>
<para style="Bullet1">Perl script writes pages out to IIS</para>
<para style="Bullet1">PerlEx to serve out dynamic stuff</para>
<para style="Bullet1">Works great: we get to use Zope and keep the site very scalable</para>
</frame>
</slide>
<slide title="Current ActiveState site" id="Slide005">
<infostring align="right" x="800" y="36" size="14" font="Helvetica"># %(page)s</infostring>
<frame x="10" y="485" width="790" height="75">
<para style="Heading1">Current ActiveState site</para>
</frame>
<frame x="50" y="75" width="750" height="400">
<para style="Bullet1">Good bits</para>
<para style="Bullet2">Fast</para>
<para style="Bullet2">Zope back end totally isolated</para>
<para style="Bullet2">Chance to alter content</para>
<para style="Bullet1">Bad bits</para>
<para style="Bullet2">Administering IIS is a real pain</para>
<para style="Bullet2">Perl script</para>
<para style="Bullet2">URLs not the same in IIS as in Zope due to acquisition</para>
<para style="Bullet2">Essentially static</para>
</frame>
</slide>
<slide title="Back to ASPN... Resources" id="Slide006">
<infostring align="right" x="800" y="36" size="14" font="Helvetica"># %(page)s</infostring>
<frame x="10" y="485" width="790" height="75">
<para style="Heading1">Back to ASPN... Resources</para>
</frame>
<frame x="50" y="75" width="750" height="400">
<para style="Bullet1">Given 3 weeks from specification to launch for most of the features</para>
<para style="Bullet1">Resources:</para>
<para style="Bullet2">4 Developers with lots of Perl, SQL and a bit of Zope experience</para>
<para style="Bullet2">1 HTML / Front end person</para>
<para style="Bullet2">Average of 75% of time to spend on it</para>
</frame>
</slide>
<slide title="Choosing a design (1)" id="Slide007">
<infostring align="right" x="800" y="36" size="14" font="Helvetica"># %(page)s</infostring>
<frame x="10" y="485" width="790" height="75">
<para style="Heading1">Choosing a design (1)</para>
</frame>
<frame x="50" y="75" width="750" height="400">
<para style="BodyText">Zope was not a shoo-in; we considered alternatives, e.g. PHP, mod_perl, PerlEx</para>
<para style="Bullet1">Fastest execution, most scalable, most expensive: Apache, mod_perl and Oracle</para>
<para style="Bullet1">Easiest to develop: Apache, Zope</para>
<para style="Bullet1">Time to develop was more important</para>
<para style="Bullet1">Ability to integrate with existing site helpful</para>
</frame>
</slide>
<slide title="Choosing a design (2)" id="Slide008">
<infostring align="right" x="800" y="36" size="14" font="Helvetica"># %(page)s</infostring>
<frame x="10" y="485" width="790" height="75">
<para style="Heading1">Choosing a design (2)</para>
</frame>
<frame x="50" y="75" width="750" height="400">
<para style="BodyText">After much debating we chose:</para>
<para style="Bullet1">Multiple Zope front ends</para>
<para style="Bullet1">MS SQL Server back end</para>
<para style="Bullet2">More faith in SQL than ZCatalog</para>
<para style="Bullet2">Already been burned by ZODB and ZCatalog in Zope 2.1.6</para>
<para style="Bullet2">ZEO not needed</para>
<para style="Bullet1">Cisco router for load balancing</para>
</frame>
</slide>
<slide title="Design" id="Slide009">
<infostring align="right" x="800" y="36" size="14" font="Helvetica"># %(page)s</infostring>
<frame x="10" y="485" width="790" height="75">
<para style="Heading1">Design</para>
</frame>
<frame x="50" y="75" width="750" height="400">
<image filename="diagram.jpg" />
</frame>
</slide>
<slide title="Testing" id="Slide010">
<infostring align="right" x="800" y="36" size="14" font="Helvetica"># %(page)s</infostring>
<frame x="10" y="485" width="790" height="75">
<para style="Heading1">Testing</para>
</frame>
<frame x="50" y="75" width="750" height="400">
<para style="BodyText">After some rapid development and one Python conference...</para>
<para style="BodyText">With two days left we were able to devote some time to meaningful testing. Unfortunately:</para>
<para style="Bullet1">It wasn't very fast but we had no time left to do major overhauls</para>
<para style="Bullet1">Wrote some simple testing tools using</para>
<para style="Bullet2">xmlrpc to automate some unit tests</para>
<para style="Bullet2">webchat to automate testing html</para>
</frame>
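<!--
A minimal sketch of the kind of XML-RPC check described on this slide, not the
tool used on ASPN. It assumes modern Python, where the client lives in
xmlrpc.client (it was xmlrpclib at the time of the talk). The page_title method,
its paths and its titles are hypothetical stand-ins, and a throwaway local
server replaces the real Zope instance so the example is self-contained.

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer


def page_title(path):
    """Hypothetical stand-in for a method published by the site."""
    titles = {"/": "ASPN Home", "/Mail": "Mailing list archives"}
    return titles.get(path, "Not found")


def start_test_server():
    """Serve page_title over XML-RPC on a free local port."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(page_title)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server


def run_checks(url):
    """Call the remote method and compare each answer with the expected one."""
    proxy = xmlrpc.client.ServerProxy(url)
    expected = {
        "/": "ASPN Home",
        "/Mail": "Mailing list archives",
        "/nope": "Not found",
    }
    return {path: proxy.page_title(path) == want for path, want in expected.items()}


server = start_test_server()
host, port = server.server_address
results = run_checks("http://%s:%d/" % (host, port))
server.shutdown()
server.server_close()
```

Pointing run_checks at a real site would only need a different URL and a table
of expected values for methods that site actually publishes.
-->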
</slide>
<slide title="Traffic (approx)" id="Slide011">
<infostring align="right" x="800" y="36" size="14" font="Helvetica"># %(page)s</infostring>
<frame x="10" y="485" width="790" height="75">
<para style="Heading1">Traffic (approx)</para>
</frame>
<frame x="50" y="75" width="750" height="400">
<para style="Bullet1">10 million hits per month</para>
<para style="Bullet1">500,000 visits</para>
<para style="Bullet1">2 ms to serve a page (?)</para>
<para style="Bullet1">about 76 Gigs a month of traffic</para>
<para style="Bullet1">Similar size to Zope.org</para>
<para style="Bullet1">Not as much as we had hoped</para>
</frame>
</slide>
<slide title="Database Size" id="Slide012">
<infostring align="right" x="800" y="36" size="14" font="Helvetica"># %(page)s</infostring>
<frame x="10" y="485" width="790" height="75">
<para style="Heading1">Database Size</para>
</frame>
<frame x="50" y="75" width="750" height="400">
<para style="BodyText">As of 30 Jan 2002 there are:</para>
<para style="Bullet1">800k e-mails from 142 mailing lists</para>
<para style="Bullet2">growing at 10,000 a day</para>
<para style="Bullet2">indexing takes about 1 hour on our servers (750 PIII, 512MB)</para>
<para style="Bullet1">40k pages of other content</para>
<para style="Bullet1">34k registered users</para>
</frame>
</slide>
<slide title="Performance Issues" id="Slide013">
<infostring align="right" x="800" y="36" size="14" font="Helvetica"># %(page)s</infostring>
<frame x="10" y="485" width="790" height="75">
<para style="Heading1">Performance Issues</para>
</frame>
<frame x="50" y="75" width="750" height="400">
<para style="Bullet1">ZODBCA is single threaded</para>
<para style="Bullet1">Too many requests coming through to Zope</para>
<para style="Bullet2">Some images</para>
<para style="Bullet1">Some sloppy coding</para>
<para style="Bullet2">eg: cyclic reference (adding REQUEST to REQUEST)</para>
<para style="Bullet2">returning large results to DTML rather than a good SQL statement</para>
<para style="Bullet1">Memory leaks</para>
<para style="Bullet2">Leak finder from Shane</para>
<para style="Bullet1">SQL indexing slows down the database</para>
</frame>
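<!--
A small illustration of why a cyclic reference such as storing REQUEST inside
REQUEST can hold on to memory. This is a sketch under assumptions: the Request
class and its "other" dict are hypothetical stand-ins rather than Zope's real
request object, and the behaviour shown is that of CPython, where reference
counting alone cannot free a cycle and the cycle collector has to do it.

```python
import gc
import weakref


class Request:
    """Hypothetical stand-in for a request object with a namespace dict."""

    def __init__(self):
        self.other = {}


def is_freed_without_gc(make_cycle):
    """Return True if dropping the last name frees the object immediately."""
    req = Request()
    if make_cycle:
        req.other["REQUEST"] = req  # the request now refers to itself
    probe = weakref.ref(req)
    del req
    return probe() is None


gc.disable()  # rely on reference counting alone for the demonstration
try:
    plain_freed = is_freed_without_gc(make_cycle=False)
    cyclic_freed = is_freed_without_gc(make_cycle=True)
finally:
    gc.enable()

collected = gc.collect()  # the cycle collector reclaims the leftover objects
```

Avoiding the self-reference in the first place keeps cleanup immediate and does
not depend on when the collector next runs.
-->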
</slide>
<slide title="Lessons Learned (SQL)" id="Slide014">
<infostring align="right" x="800" y="36" size="14" font="Helvetica"># %(page)s</infostring>
<frame x="10" y="485" width="790" height="75">
<para style="Heading1">Lessons Learned (SQL)</para>
</frame>
<frame x="50" y="75" width="750" height="400">
<para style="Bullet1">MS SQL is not as great as we thought</para>
<para style="Bullet2">Slow to index content</para>
<para style="Bullet2">More problems with SQL and connection than anything else</para>
<para style="Bullet2">Properly used ZEO and ZCatalog is probably better for full text</para>
<para style="Bullet1">Tight control is needed: unfortunately, people write content in Zope because it's easier, and then it doesn't get indexed</para>
</frame>
</slide>
<slide title="Lessons Learned (Caching)" id="Slide015">
<infostring align="right" x="800" y="36" size="14" font="Helvetica"># %(page)s</infostring>
<frame x="10" y="485" width="790" height="75">
<para style="Heading1">Lessons Learned (Caching)</para>
</frame>
<frame x="50" y="75" width="750" height="400">
<para style="Bullet1">Redesign the front end to make caching easier</para>
<para style="Bullet2">Unfortunately lots of ASPN is designed to be dynamic</para>
<para style="Bullet2">The most popular pages (Home, mailing lists) are dynamic</para>
<para style="Bullet1">Cache as much as possible</para>
<para style="Bullet2">Start with the easy things: images, css, static content</para>
<para style="Bullet1">Caching per thread is nowhere near as useful as caching across all the threads (especially if you are running with lots of threads)</para>
</frame>
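<!--
The last point on this slide can be sketched as follows. This is an
illustration only, assuming a hypothetical render function and plain Python
threads rather than Zope's own cache managers: with one cache per thread every
thread pays for its own miss, while a single cache shared across threads pays
once.

```python
import threading

THREADS = 8
render_calls = {"per_thread": 0, "shared": 0}
count_lock = threading.Lock()


def render(kind):
    """Hypothetical expensive page render; counts how often it runs."""
    with count_lock:
        render_calls[kind] += 1
    return "rendered home page"


local = threading.local()


def get_per_thread(key):
    """Each thread keeps its own dict, so each one pays a miss."""
    cache = getattr(local, "cache", None)
    if cache is None:
        cache = local.cache = {}
    if key not in cache:
        cache[key] = render("per_thread")
    return cache[key]


shared_cache = {}
shared_lock = threading.Lock()


def get_shared(key):
    """One dict for all threads; the lock makes the first miss the only one."""
    with shared_lock:
        if key not in shared_cache:
            shared_cache[key] = render("shared")
        return shared_cache[key]


def run(getter):
    """Request the same page once from each of THREADS worker threads."""
    threads = [threading.Thread(target=getter, args=("home",)) for _ in range(THREADS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


run(get_per_thread)
run(get_shared)
```

The gap between the two counters grows with the thread count, which matches the
remark about running with lots of threads.
-->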
</slide>
<slide title="Things we got right" id="Slide016">
<infostring align="right" x="800" y="36" size="14" font="Helvetica"># %(page)s</infostring>
<frame x="10" y="485" width="790" height="75">
<para style="Heading1">Things we got right</para>
</frame>
<frame x="50" y="75" width="750" height="400">
<para style="Bullet1">Separating development and production totally</para>
<para style="Bullet1">Trying to avoid DTML when possible (not enough)</para>
<para style="Bullet2">Writing Python products is good, anything else is bad</para>
<para style="Bullet1">Having a separate database that can be written to in many different ways</para>
<para style="Bullet1">Using a SQL back end</para>
<para style="Bullet1">Once the content switch was stable it rocked</para>
<para style="Bullet2">Fully recommended if you can afford it; it does load balancing, failover, etc.</para>
</frame>
</slide>
<slide title="Improvements" id="Slide017">
<infostring align="right" x="800" y="36" size="14" font="Helvetica"># %(page)s</infostring>
<frame x="10" y="485" width="790" height="75">
<para style="Heading1">Improvements</para>
</frame>
<frame x="50" y="75" width="750" height="400">
<para style="Bullet1">Waiting for stable Zope (2.4.4 looking good so far)</para>
<para style="Bullet2">ZmxODBCA working plus caching SQL statements</para>
<para style="Bullet2">RAM cache manager and Accelerated cache manager</para>
<para style="Bullet2">Consolidating DTML</para>
<para style="Bullet2">Removing Site Access</para>
<para style="Bullet1">Performance result:</para>
<para style="Bullet2">Approx 2x performance</para>
<para style="Bullet2">Hard to measure until into production</para>
</frame>
</slide>
<slide title="Questions?" id="Slide018">
<infostring align="right" x="800" y="36" size="14" font="Helvetica"># %(page)s</infostring>
<frame x="10" y="485" width="790" height="75">
<para style="Heading1">Questions?</para>
</frame>
<frame x="50" y="75" width="750" height="400">
<para style="Bullet1">ASPN: http://aspn.ActiveState.com</para>
<para style="Bullet1">WWW: http://www.ActiveState.com</para>
</frame>
</slide>
</section>
</presentation>