Windows Cloud and Live Mesh

Filed Under (Windows, live mesh, windows cloud) by Antonio Ortiz on 06-10-2008



Microsoft has a platform vocation: it wants to control and serve the base on which others build products and services. That is what Windows has been on the user’s desktop and the company server for the last 20 years, so it should not surprise us that Ballmer is talking about Windows Cloud (that will not be its real name; the statements are in The Register), a new operating system “in the cloud” that will be presented in a few weeks at Microsoft’s Professional Developers Conference.

Of course, Ballmer gives few details about what this Windows Cloud will be, and to add to the confusion, “the cloud” has become the fashionable expression for almost anything done on the web. So far Microsoft has approached the trend of data and applications on the web in several ways: Silverlight as a technology for RIAs, software plus services as an alternative to software as a service through the browser, and Mesh as a future synchronization platform.

This scenario, together with the fact that the presentation will take place at a developer event, leads me to think that the announcement will be a first open version of Mesh with applications already developed. If we recall the approach to Mesh as a platform, we see that it fits Ballmer’s words and the other pieces in the puzzle of Microsoft’s reorientation toward the cloud: richer experiences in the browser with Silverlight, software plus services with the data and part of the intelligence “in the cloud”, and synchronization across devices and platforms. I imagine that what they will make public at the PDC is the opening of Mesh to application development.

Of course, all of this is still speculation. In fact, we could think of Microsoft’s own version of what Amazon already does with Windows Server, or wonder, as Enrique does, whether it is a lightweight version of Windows for ultraportables. Perhaps the most sensible stance has been Sacha’s, who leaves the question up in the air. Or should I say in the cloud?

Links for 2008-10-02 [del.icio.us]

Filed Under (varios) by Pere MAJORAL on 03-10-2008


Links for 2008-09-26 [del.icio.us]

Filed Under (varios) by Pere MAJORAL on 27-09-2008


Oracle Enters the AWS Cloud

Filed Under (Announcements) by AWS Editor on 22-09-2008


We’ve been working with Oracle to bring a number of their products into the cloud. The first fruits of this work are now ready: cloud-compatible licensing, EC2 AMIs preloaded with a variety of Oracle products, support programs, backup to the cloud, and a cloud management portal.

As more and more enterprises take a look at Amazon Web Services, they invariably ask about packaged software, particularly databases. With this announcement, AWS users now gain access to a commercial-grade, brand-name database, along with the tools and middleware needed to build and host heavy-duty enterprise applications in the Amazon cloud.

So, what’s available?

The Oracle Database 11g, Oracle Fusion Middleware, and Oracle Enterprise Manager can now be licensed to run in the cloud on Amazon EC2. Customers can even use their existing software licenses with no additional license fees. Read more about cloud licensing here.

I should say a few words about licensing here because this question comes up all the time. The variability and flexibility of cloud-based licensing have perplexed users and vendors for some time now. Now that a large software vendor has made a clear statement of direction, we should see more and more cloud-compatible licenses before too long.

These products, along with Oracle Enterprise Linux, are available in prepackaged, ready-to-run form, encapsulated within a set of free Amazon EC2 AMIs. Using these AMIs, new instances can be launched and ready within minutes. Of course, Oracle’s development tools (Oracle Application Express, Oracle JDeveloper, Oracle Enterprise Pack for Eclipse, and Oracle Workshop for WebLogic) can all be used to build applications for this new environment.

What does this mean? Instead of budgeting for and acquiring hardware, setting it up, and installing an operating system and several layers of complex packages, you can simply launch one of these AMIs on EC2 and be up and running in minutes. This is definitely no-fuss, no-muss application development and deployment.

But wait, there’s more…

Oracle Enterprise Linux on EC2 is fully supported by Oracle Unbreakable Support and Amazon Premium Support. Once again, another potential adoption barrier has been lowered. If you’ve got a problem, Oracle and Amazon are ready to help out.

There’s also a secure backup solution for database servers running on EC2 or within the corporate network. The new Oracle Secure Backup Cloud Module allows customers to use Amazon S3 as a backup destination with virtually unlimited capacity, obviating the need to deal with local backup devices. The module encrypts backups and makes use of multiple connections to S3 to maximize throughput.

Need I even talk about how painful and expensive backup used to be? Buying expensive devices and media, keeping the media safe and secure offsite (yet still available if needed for a recovery), dealing with physical space issues, and 100 other things. Now, simply send your bits to Amazon S3 and forget about dealing with all of these other issues.

And if that’s not enough, Oracle has also unveiled a new Cloud Management Portal. This is a free, web-based way to manage Oracle software running in the cloud.

These products will be on display at Oracle OpenWorld, which is taking place this week at the Moscone Center in San Francisco. If you are at the conference, please stop by the AWS booth to say hello and to learn more.

I’ll be speaking at the Storage Developer Conference in Santa Clara tomorrow (September 23) and will talk about this offering as well. Once again, say hello.

Here are some very useful white papers and other resources:

 

  1. Oracle’s Cloud Computing Center, chock full of links, demos, and information.
  2. A data sheet, Oracle In The Cloud.
  3. A white paper, Oracle Data Backup in the Cloud.
  4. Oracle’s Cloud Computing FAQ.
  5. Dynamic demo showing how to launch the Oracle Database on EC2.

– Jeff;

Our data in the cloud: the case of MobileMe

Filed Under (varios) by Andrés Milleiro on 19-09-2008


With the direction of things becoming ever clearer, especially in mobility, I continue this short series of articles on the cloud and cloud computing, which I began by discussing the concept of the cloud in mobility.

This time we get a little more practical and present one of the alternatives for managing our data on the move with “the cloud”: MobileMe, the solution launched not long ago by the computing giant Apple.

For those who do not know it, what does MobileMe really offer for mobility? Basically, the ability to keep data fully synchronized between our computer (Mac or PC, it makes no difference, since there is a Windows version and it works well with Outlook) and a device such as an iPod Touch, an iPhone, or another PC or Mac.

In the cloud (in this case, the servers Apple dedicates to the service) we get e-mail, a contacts address book, a calendar, a 10 GB virtual hard disk, and a photo gallery. Besides the synchronization between our devices, we can also access all of them through me.com.

Clearly this solution already enjoys acceptance simply because of the giant that Apple is and the growing number of people who use its computers and services. There are in fact alternatives, separate and not integrated as in this case, that would give us all of these services for free so that we can work on the move with our data in the cloud.

The cost is 79 euros per year, which gets us all of the above, in particular the 10 GB of online disk space that may be a strong incentive for many: we can store all our office documents, or whatever else we need for work, and access them from anywhere with nothing more than an internet connection such as the one a 3G modem provides.

In my opinion, there will be cases in which this money is extremely well spent and others in which it is not, so passing an overall verdict on the service seems very risky to me. In any case, I have to admit that I have this service activated and I like it, especially for the simplicity and convenience of not having these kinds of services scattered across the web, and of being able to reach all of them easily wherever you are.

More information | MobileMe

Links for 2008-09-18 [del.icio.us]

Filed Under (varios) by Pere MAJORAL on 19-09-2008


GoGrid and RightScale Announce Cloud Computing Partnership

Filed Under (General, GoGrid, Partners, RightScale, cloud computing, partnership, press, press release) by Michael Sheehan on 17-09-2008


Today, GoGrid and RightScale announced a major new strategic product partnership. The full press release is below:

RightScale First to Deliver Integrated Management for Multi-Cloud Environments

Cloud Computing Momentum Builds with RightScale Support for FlexiScale, GoGrid and Rackspace

Interop New York Conference Expo (PRWEB) September 17, 2008 — RightScale, Inc., the leader in cloud computing management, today announced a major new strategic product and partnership initiative as it broadens its cloud management platform to support emerging clouds from new vendors, including FlexiScale and GoGrid, while continuing its long-standing support for Amazon’s EC2. RightScale is also working with Rackspace to assure compatibility with their cloud offerings, including Mosso and CloudFS. RightScale will be the first in the industry to offer an integrated management dashboard, where applications can be deployed once and managed across these and other clouds.

Businesses can take advantage of the nearly infinite scalability of cloud computing by using RightScale to deploy their applications on a supported cloud provider. They gain the capabilities of built-in redundancy, fault tolerance, and geographical distribution of resources - key enterprise demands for cloud providers. With RightScale, customers can leverage the leading cloud management platform to automatically deploy and manage their web applications - scaling up when traffic demands, and scaling back as appropriate - allowing them to focus on their core business objectives. RightScale’s automated system management, pre-packaged and re-usable components, leading service expertise and best practices have been proven as best-of-breed, with customers deploying hundreds of thousands of instances on Amazon’s EC2.

“Cloud computing is a disruptive force in the business world because it provides pay-as-you-go, on-demand, virtually infinite compute and storage resources that can expand or contract as needed,” said Michael Crandell, CEO of RightScale, Inc. “A number of public providers are already adopting cloud architectures - and we also see private enterprise clouds coming on the horizon. Today’s announcement of RightScale’s partnerships with FlexiScale and GoGrid is an exciting indication of how mid-market and enterprise organizations can really take advantage of multi-cloud architectures. There will be huge opportunities for application design and deployment — we are at the beginning of a tidal shift in IT infrastructure.”

FlexiScale is the only UK-based cloud computing provider and offers a unique infrastructure on demand with 99.99% SLA and many special features. For example, each customer gets their own virtual disk so that data is segregated and they can do their own low level encryption, while virtual network traffic is also segregated to deliver added security. FlexiScale uniquely offers permanent on demand storage and was the first cloud provider to support Windows. With a strong reputation for customer service, it also enables the creation of custom packages such as golden images.

Tony Lucas, CEO of XCalibre and creator of FlexiScale commented: “Without this new ability to move swiftly and easily between platforms, customers could feel locked in and much more hesitant to try and use cloud computing. RightScale’s partnership initiative is a great example of how having near interoperability between systems will enable customers to be less hesitant of moving to a new technology, which is great for everyone. It means the industry can and will grow quicker than if it was only a handful of individual companies providing distinct services that weren’t compatible with each other.”

GoGrid offers hosted cloud computing infrastructure that enables system administrators, developers and IT professionals to create, deploy, and control load balanced cloud servers and complex hosted virtual server networks. GoGrid also delivers portal controlled servers for Windows 2003 and 2008, multiple Linux operating systems and supports application environments like Ruby on Rails. GoGrid is unique in cloud computing with the availability of 32-bit and 64-bit editions of Windows Server 2008; and was named winner of LinuxWorld 2008 “Best of Show” in August. Together, GoGrid and RightScale will provide joint cloud solutions that are elegant and bring power, control and scalability to business customers.

Cloud computing for the enterprise has arrived with the GoGrid and RightScale partnership. Corporations now have few excuses not to, and multiple reasons to, deploy and manage complex and redundant cloud infrastructures in real time using the GoGrid, RightScale, and FlexiScale technologies.
- GoGrid CEO, John Keagy

Rackspace Hosting provides IT systems and computing-as-a-service to more than 33,000 customers worldwide. Combining RightScale’s technologies with Rackspace’s focus on Fanatical Support™ will allow companies to focus on their business and not spend a disproportionate amount of resources on IT demands.

Deploying scalable, reliable applications from scratch in a multi-cloud world is a time consuming and expensive task. As a result, most organizations do not have the expertise or resources to deploy and manage cloud computing applications cost effectively and according to best practices. With RightScale’s platform, any organization can easily tap the enormous power of cloud computing for a virtually infinite, affordable, “pay-as-you-go” IT infrastructure. RightScale’s offerings provide rapid deployment, a dynamically scalable infrastructure to meet varying traffic and loads, and require minimal resources using automated tools and a centralized web dashboard for easy management backed by best practices and professional services.

###

The full press release can be viewed here.

Friday Fun Fest - A Plethora of Interesting AWS Stuff

Filed Under (Cool Sites) by AWS Editor on 13-09-2008


It is time for one of my inbox-clearing blog posts once again. Here’s a bunch of cool stuff that you might like:

  • Benjamin Kudria just wrapped up an internship at the New York Times. He wrote a detailed recap of his experience and noted that he had the opportunity to use Amazon EC2 and S3. As he notes, "had never worked with AWS before, and I was amazed at how easy it was to have my managers agree to offload a pretty significant part of our functionality to Amazon’s servers. I ended up learned a lot about S3 and EC2!"
  • rPath will be sponsoring a Cloud Computing Meetup in New York City next week at the Westin in Times Square. The meetup will take place after the conclusion of the AWS Start-Up event at the same location.
  • SubCloud is an enterprise file system implemented on top of Amazon S3 using FUSE. It has a rich feature list, lots of documentation, and is available for trial use via a time-limited license key. Files are stored directly into S3, which means that they can also be accessed using other S3 tools.
  • There’s a new release of Bucket Explorer, with support for copying, renaming, and moving files, local vs. remote file comparison, reporting, and much more. Versions for Windows, Linux, and the Mac are available.
  • Adam Kalsey of Workhabit wrote to tell me that they’ve used EC2 and EBS to create a fully managed, autoscaling Drupal hosting platform. You can read more in the blog post and you can learn even more about it here. The platform takes care of all of the dirty work. As they say:
    We took everything we know about scaling Drupal and built it into a
    turn-key cluster called Elastic2 that’s pre-tuned to run Drupal. Simply
    place your Drupal app on the cluster and you’ll be able to run and
    scale your site. We continually monitor your servers and traffic and
    automatically add capacity to the cluster as needed.

    You can also watch Adam discuss his new pride and joy.

  • Next week, folks from BioTeam, Univa UD and AWS will jointly deliver a live webinar: "Cloud and Clusters: Running UniCluster in Amazon's EC2." The webinar is free but you'd better sign up ahead of time. They’ll provide an overview of HPC (High Performance Computing) using EC2, who’s doing it and how.
  • My friend Adam Rifkin sent me a link to a really interesting blog post by Andy Baio. Andy used the Amazon Mechanical Turk to uncover release dates for a list of hundreds of sound clips used inside of a music mashup. He was very happy with the quality and speed of the work: "Within an hour, all but 4 answers were submitted. The median time to finish a request was an impressive 26 seconds." I've also tagged a couple of other good Mechanical Turk success stories on Delicious.
  • Damien Tanner wrote to tell me that New Bamboo has released Panda, an open source solution for video uploading, encoding, and streaming. Running entirely within the AWS cloud, Panda makes use of EC2, S3, and SimpleDB. Panda is available as an EC2 AMI (Amazon Machine Image) for easy launching. There’s also a complete getting started guide. Once running, Panda is accessed using a REST API.
  • Since I just mentioned SimpleDB, I should also note that we have a job opening for a Business Development Manager for SimpleDB. Details are in the job description — you’ll need 5-7 years of relevant experience, a technical degree, and great communication skills. If you click the link and land on a different job, go here and search for "SimpleDB."

Ok, I think that about does it for tonight. If you’ve built something interesting using an Amazon Web Service, drop me a line and I’ll do my best to mention it here.

– Jeff;

Introduction to OSGi

Filed Under (varios) by Tatyana on 04-09-2008


Cloud Services focuses on creating innovative solutions by enabling technologies we believe in to work in cloud environments. Today we would like to present OSGi, the dynamic module system for Java™.

The article by Peter Kriens, Director of Technology for the OSGi Alliance, addresses many questions a newcomer might have about the benefits of developing with OSGi:

OSGi technology provides solutions to problems that many people simply see as intrinsic aspects of software development in Java and would not call them problems.

Well, these problems are not intrinsic and OSGi technology solves many of them. This article tries to explain why OSGi technology is relevant and why software developers, as well as strategic people, should pay attention. Some people say OSGi technology is the best kept secret of the computing industry. Let us try to change this.

So, what benefits does OSGi’s component system provide you? Well, quite a list:

Reduced Complexity - Developing with OSGi technology means developing bundles: the OSGi components. Bundles are modules. They hide their internals from other bundles and communicate through well defined services. Hiding internals means more freedom to change later. This not only reduces the number of bugs, it also makes bundles simpler to develop because correctly sized bundles implement a piece of functionality through well defined interfaces. There is an interesting blog that describes what OSGi technology did for their development process.

Reuse - The OSGi component model makes it very easy to use many third party components in an application. An increasing number of open source projects provide their JARs ready made for OSGi. However, commercial libraries are also becoming available as ready made bundles.

Real World - The OSGi framework is dynamic. It can update bundles on the fly and services can come and go. Developers used to more traditional Java see this as a very problematic feature and fail to see the advantage. However, it turns out that the real world is highly dynamic and having dynamic services that can come and go makes the services a perfect match for many real world scenarios. For example, a service could model a device in the network. If the device is detected, the service is registered. If the device goes away, the service is unregistered. There are a surprising number of real world scenarios that match this dynamic service model. Applications can therefore reuse the powerful primitives of the service registry (register, get, list with an expressive filter language, and waiting for services to appear and disappear) in their own domain. This not only saves writing code, it also provides global visibility, debugging tools, and more functionality than would have been implemented for a dedicated solution. Writing code in such a dynamic environment sounds like a nightmare, but fortunately, there are support classes and frameworks that take most, if not all, of the pain out of it.
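The device example in the quote can be illustrated with a toy registry. This is only a sketch in Python of the register/get/listen idea; `ServiceRegistry`, its methods, and the event names are hypothetical and are not the OSGi API, which is Java and considerably richer (filters, service properties, trackers).

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List, Tuple

Listener = Callable[[str, str, Any], None]


class ServiceRegistry:
    """Toy stand-in for a dynamic service registry (hypothetical, not OSGi)."""

    def __init__(self) -> None:
        self._services: Dict[str, List[Any]] = defaultdict(list)
        self._listeners: List[Listener] = []

    def add_listener(self, callback: Listener) -> None:
        # Listeners are told whenever a service appears or disappears.
        self._listeners.append(callback)

    def register(self, interface: str, service: Any) -> None:
        self._services[interface].append(service)
        self._notify("REGISTERED", interface, service)

    def unregister(self, interface: str, service: Any) -> None:
        self._services[interface].remove(service)
        self._notify("UNREGISTERED", interface, service)

    def get(self, interface: str) -> List[Any]:
        # Return a copy so callers cannot mutate the registry's state.
        return list(self._services.get(interface, []))

    def _notify(self, event: str, interface: str, service: Any) -> None:
        for callback in self._listeners:
            callback(event, interface, service)


def demo_device_lifecycle() -> Tuple[list, list, list]:
    """A device shows up on the network, then goes away again."""
    registry = ServiceRegistry()
    events: list = []
    registry.add_listener(lambda event, iface, svc: events.append((event, iface, svc)))

    registry.register("printer", "printer-on-floor-2")    # device detected
    while_present = registry.get("printer")
    registry.unregister("printer", "printer-on-floor-2")  # device gone
    after_removal = registry.get("printer")
    return events, while_present, after_removal
```

Consumers that subscribe through `add_listener` never poll for the device; they simply react to the two events, which is the property the article highlights.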

We strongly encourage you to read the entire article and see how this technology might benefit you. If you would like a more detailed introduction to OSGi, this is where you could start:

- The OSGi Architecture

- Getting Started with OSGi by Neil Bartlett.

Cloud Services makes it possible to deploy server-side OSGi applications in Amazon EC2 instances. With several mouse clicks, exported bundles can be uploaded to remote storage (S3) and added to a profile (Launch Configuration). That is all it takes to start virtual servers (EC2 instances) containing an OSGi framework provisioned with the selected bundles.

Toward the end of the company data center: persistence in Amazon EC2

Filed Under (Desarrollo, amazon, web 2.0) by Antonio Ortiz on 26-08-2008



The latest step in Amazon’s web services is called “Elastic Block Store” and adds persistence to EC2, the service through which Amazon offers processing capacity “in the cloud”. If we combine this step with the rest of its offering, SimpleDB and S3, we have an increasingly complete solution for outsourcing the company data center to its platform.

With this “Elastic Block Store” there is less and less difference between what “normal” hosting offers and solutions such as Amazon’s web services. Among its competitors, Google App Engine is left far behind in the platform-as-a-service race.

Of course, talking today about the end of data centers is getting well ahead of the trend (Dion Hinchcliffe), besides ignoring the problems that “cloud solutions” bring, both legal and technical. The biggest concern is the dependence on an external company to keep the service running, but on the other hand the advantages are numerous: scalability and cost are the two that will carry the most weight when companies consider outsourcing their data center. As for availability, it is true that there are outages, but it is also worth asking whether we, as a company, would be able to achieve the stability that Amazon offers.

There are plenty of good articles that can help assess the step Amazon has taken, which strengthens its excellent platform strategy.

Links for 2008-08-25 [del.icio.us]

Filed Under (varios) by Pere MAJORAL on 26-08-2008


Links for 2008-08-21 [del.icio.us]

Filed Under (varios) by Pere MAJORAL on 22-08-2008


Amazon’s Elastic Block Store explained

Filed Under (AWS, EC2, cloud computing) by Thorsten on 21-08-2008


Now that Amazon’s Elastic Block Store is live, I thought it’d be helpful to explain all the ins and outs of EBS volumes as well as how to use them. The official information about EBS is found on the AWS site. I’ve written about the significance of EBS before and I’ll follow up with a post about some new use cases it enables.

The Basics

EBS starts out really simple: you create a volume from 1GB to 1TB in size and then you mount it on a device (like /dev/sdj) on an instance, format it, and off you go. Later you can detach it, let it sit for a while, and then reattach it to a different instance. You can also snapshot the volume at any time to S3, and if you want to restore your snapshot you can create a fresh volume from the snapshot. Sounds simple, eh? It is, but the devil is in the details!

Amazon Elastic Block Store features

Reliability

EBS volumes have redundancy built-in, which means that they will not fail if an individual drive fails or some other single failure occurs. But they are not as redundant as S3 storage which replicates data into multiple availability zones: an EBS volume lives entirely in one availability zone. This means that making snapshot backups, which are stored in S3, is important for long-term data safeguarding.

I know that folks at Amazon have thought long and hard about how to characterize the reliability of EBS volumes, so here’s their explanation taken from the EC2 detail page:

Amazon EBS volumes are designed to be highly available and reliable. Amazon EBS volume data is replicated across multiple servers in an Availability Zone to prevent the loss of data from the failure of any single component. The durability of your volume depends both on the size of your volume and the percentage of the data that has changed since your last snapshot. As an example, volumes that operate with 20 GB or less of modified data since their most recent Amazon EBS snapshot can expect an annual failure rate (AFR) of between 0.1% - 0.5%, where failure refers to a complete loss of the volume. This compares with commodity hard disks that will typically fail with an AFR of around 4%, making EBS volumes 10 times more reliable than typical commodity disk drives.

From a practical point of view what this means is that you should expect the same type of reliability you get from a fully redundant RAID storage system. While it may be technically possible to increase the reliability by, for example, mirroring two EBS volumes in software on one instance, it is much more productive to rely on EBS directly. Focus your efforts on building a good snapshot strategy that ensures frequent and consistent snapshots, and build good scripts that allow you to recover from many types of failures using the snapshots and fresh instances and volumes.
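To put the quoted failure rates side by side, here is a small arithmetic sketch. The three rates come from the quoted passage; the function names are made up for illustration, and the multi-year survival figure assumes that failures in different years are independent, which is a simplification of mine rather than anything Amazon states.

```python
def reliability_ratio(disk_afr: float, volume_afr: float) -> float:
    """How many times lower the volume's annual failure rate is than the disk's."""
    return disk_afr / volume_afr


def survival_probability(afr: float, years: int) -> float:
    """Chance of no failure over `years`, assuming independent years (simplification)."""
    return (1.0 - afr) ** years


DISK_AFR = 0.04        # "around 4%" for commodity hard disks
EBS_AFR_LOW = 0.001    # lower end of the quoted "0.1% - 0.5%"
EBS_AFR_HIGH = 0.005   # upper end of the quoted range

best_case = reliability_ratio(DISK_AFR, EBS_AFR_LOW)    # about 40
worst_case = reliability_ratio(DISK_AFR, EBS_AFR_HIGH)  # about 8
```

Read this way, the "10 times more reliable" figure is a round number inside a range of roughly 8x to 40x, and the range only applies while less than 20 GB has changed since the last snapshot, which is one more argument for snapshotting frequently.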

Volume performance

Our performance observations are based on the pre-release EBS volumes, thus some variations on the production systems should be expected. On the one hand our pre-release tests were probably running on a small infrastructure with fewer users, but on the other hand many of these users were also running stress tests, so it’s really hard to tell how all this will carry over. Only time will tell.

EBS volumes are network-attached disk storage and thus take a slice off the instance’s overall network bandwidth. The speed of light here is evidently 1 Gbps, which means that the peak sequential transfer rate is about 120 MBytes/sec. “Any number larger than that is an error in your math.” We see over 70 MB/sec using sysbench on an m1.small instance, which is hot! Presumably we didn’t get much network contention from other small instances on the same host when running the benchmarks. For random access we’ve seen over 1000 I/O ops/sec, but it’s much more difficult to benchmark those types of workloads. The bottom line though is that performance exceeds what we’ve seen for filesystems striped across the four local drives of x-large instances.
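The ceiling on sequential throughput follows from unit conversion alone. The sketch below shows it; the `efficiency` parameter and the 0.96 value are illustrative assumptions of mine to account for protocol overhead, not measured numbers.

```python
def peak_mbytes_per_sec(link_gbps: float, efficiency: float = 1.0) -> float:
    """Upper bound on sequential throughput over a network link.

    link_gbps  -- link speed in gigabits per second
    efficiency -- fraction of raw bandwidth left after protocol overhead
                  (hypothetical knob; 1.0 means no overhead at all)
    """
    bits_per_sec = link_gbps * 1e9
    bytes_per_sec = bits_per_sec / 8
    return bytes_per_sec / 1e6 * efficiency


raw_limit = peak_mbytes_per_sec(1.0)              # 125 MB/s with zero overhead
with_overhead = peak_mbytes_per_sec(1.0, 0.96)    # about 120 MB/s
```

Against that ceiling, the 70 MB/sec observed on a small instance is already more than half of what the link could deliver under any circumstances.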

With EBS it is possible to increase the I/O transaction rate further by mounting multiple EBS volumes on one instance and striping filesystems across them. For streaming performance this doesn’t seem worthwhile as the limit of the available instance network bandwidth is already reached with one volume, but it can increase the performance of random workloads as more heads can be seeking at a time.

Snapshot backups

Snapshot backups are simultaneously the most useful and the most difficult to understand feature of EBS. Let me try to explain. A snapshot of an EBS volume can be taken at any time. It causes a copy of the data in the volume to be written to S3, where it is stored redundantly in multiple availability zones (like all data in S3). The first peculiarity is that snapshots do not appear in your S3 buckets, thus you can’t access them using the standard S3 API. You can only list the snapshots using the EC2 API and you can restore a snapshot by creating a new volume from it. The second peculiarity is that snapshots are incremental, which means that in order to create a subsequent snapshot, EBS only saves to S3 the disk blocks that have changed since the previous snapshot.

How the incremental snapshots work conceptually is depicted in the diagram below. Each volume is divided up into blocks. When the first snapshot of a volume is taken all blocks of the volume that have ever been written are copied to S3, and then a snapshot table of contents is written to S3 that lists all these blocks. Now, when the second snapshot is taken of the same volume only the blocks that have changed since the first snapshot are copied to S3. The table of contents for the second snapshot is then written to S3 and lists all the blocks on S3 that belong to the snapshot. Some are shared with the first snapshot, some are new. The third snapshot is created similarly and can contain blocks copied to S3 for the first, second and third snapshots.

[Figure: incremental storage of EBS snapshot blocks in Amazon S3]

There are two nice things about the incremental nature of the snapshots: it saves time and space. Taking subsequent snapshots can be very fast because only changed blocks need to be sent to S3, and it saves space because you’re only paying for the storage in S3 of the incremental blocks. What is difficult to answer is how much space a snapshot uses. Or, to put it differently, how much space would be saved if a snapshot were deleted. If you delete a snapshot, only the blocks that are used exclusively by that snapshot (i.e. are referenced only by that snapshot’s table of contents) are deleted.
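The table-of-contents idea, and why deleting a snapshot frees so little, can be made concrete with a toy model. This is a conceptual sketch only: `SnapshotStore`, the block keys, and the comparison against the previous snapshot are my own illustration and say nothing about how EBS is implemented internally.

```python
from typing import Dict


class SnapshotStore:
    """Toy model of incremental snapshots (hypothetical, not the EBS design)."""

    def __init__(self) -> None:
        self.stored_blocks: Dict[str, bytes] = {}     # block key -> data "in S3"
        self.tables: Dict[str, Dict[int, str]] = {}   # snapshot -> {block index: key}
        self._last_toc: Dict[int, str] = {}           # table of the latest snapshot
        self._counter = 0

    def snapshot(self, name: str, volume: Dict[int, bytes]) -> int:
        """Record a snapshot; return how many blocks had to be uploaded."""
        toc: Dict[int, str] = {}
        uploaded = 0
        for index, data in volume.items():
            previous_key = self._last_toc.get(index)
            if previous_key is not None and self.stored_blocks.get(previous_key) == data:
                toc[index] = previous_key             # unchanged: share the old block
            else:
                self._counter += 1
                key = f"blk-{self._counter}"
                self.stored_blocks[key] = data        # changed or new: upload it
                toc[index] = key
                uploaded += 1
        self.tables[name] = toc
        self._last_toc = toc
        return uploaded

    def delete(self, name: str) -> int:
        """Delete a snapshot; return how many stored blocks were actually freed."""
        toc = self.tables.pop(name)
        still_referenced = {key for table in self.tables.values() for key in table.values()}
        freed = 0
        for key in set(toc.values()):
            if key not in still_referenced:           # only this snapshot used it
                del self.stored_blocks[key]
                freed += 1
        return freed
```

With three snapshots of a three-block volume where one block changes each time, the second and third snapshots each upload a single block, and deleting the middle snapshot frees nothing because its neighbours still reference every block it lists.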

Something to be very careful about with snapshots is consistency. A snapshot is taken at a precise moment in time even though the blocks may trickle out to S3 over many minutes. But in most situations you will really want to control what’s on disk vs. what’s in flight at the moment of the snapshot. This is particularly important when using a database. We recommend you freeze the database, freeze the file system, take the snapshot, then unfreeze everything. At the file system level we’ve been using xfs for all the large local drives and EBS volumes because it’s fast to format and supports freezing. Thus when taking a snapshot we perform an xfs freeze, take the snapshot, and unfreeze. When running MySQL we also issue “FLUSH TABLES WITH READ LOCK” to briefly halt writes. All this ensures that the snapshot doesn’t contain partial updates that need to be recovered when the snapshot is mounted. It’s like USB dongles: if you pull the dongle out while it’s being written to, “your mileage may vary” when you plug it back into another machine…
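The ordering matters as much as the individual steps: whatever gets frozen has to be released again even if the snapshot call fails. Below is a minimal sketch of that discipline. The five callables are placeholders that I introduce for illustration; in practice they might wrap `xfs_freeze -f` and `xfs_freeze -u`, a MySQL session holding the read lock, and an EC2 API call, but none of that wiring is shown or assumed here.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


@contextmanager
def frozen(freeze: Callable[[], None], unfreeze: Callable[[], None]) -> Iterator[None]:
    """Run a block with something frozen; always unfreeze afterwards."""
    freeze()
    try:
        yield
    finally:
        unfreeze()


def consistent_snapshot(
    lock_db: Callable[[], None],
    unlock_db: Callable[[], None],
    freeze_fs: Callable[[], None],
    unfreeze_fs: Callable[[], None],
    take_snapshot: Callable[[], T],
) -> T:
    """Freeze database, then file system, snapshot, then release in reverse order."""
    with frozen(lock_db, unlock_db):
        with frozen(freeze_fs, unfreeze_fs):
            return take_snapshot()
```

Nesting the two context managers gives the reverse-order release for free, and the `finally` clause means a failed snapshot request cannot leave the file system frozen.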

Snapshot performance appears to be pretty much gated by the performance of S3, which is around 20MBytes/sec for a single stream. The three big bonuses here are that the snapshot is incremental, that the data is compressed, and that all this is performed in the background by EBS without affecting the instance on which the volume is mounted much. Obviously the data needs to come off the disks, so there is some contention to be expected, but compared to having to do the transfer from disk through the instance to S3 it is like night and day.

Availability Zones

EBS volumes can only be mounted on an instance in the same availability zone, which makes sense when you think of availability zones as being equivalent to datacenters. It would probably be technically possible to mount volumes across zones, but from a network latency and bandwidth point of view it doesn’t make much sense.

The way you get a volume’s data from one zone into another is through a snapshot: you snapshot one volume and then immediately create a new volume in a different zone from the snapshot. We have really gotten away from the idea of unmounting a volume from one instance and then remounting it on the next one: we always go through a snapshot, for a variety of reasons. The way we think and operate is as follows:

  • You create a volume, mount it on an instance, format it, and write some data to it.
  • Then you periodically snapshot the volume for backup purposes.
  • If you don’t need the instance anymore, you may terminate it and, after unmounting the volume, you always take a final snapshot. If the instance crashes instead of terminating properly, you also always take a final snapshot of the volume as it was left.
  • When you launch a new instance on which you want the same data, you create a fresh volume from your snapshot of choice. This may be the last snapshot, but it could also be a prior one if it turns out that the last one is corrupt (e.g. in the case of an instance crash or of some software failure).

By creating a volume from the snapshot you achieve two things: first, you are independent of the availability zone of the original volume, and second, you have a repeatable process in case mounting the volume fails, which can easily happen, especially if the unmount wasn’t clean.

Now, of course, in some situations you can directly remount the original volume instead of creating a new volume from a snapshot as an optimization. This applies if the new instance is in the same availability zone, the volume corresponds to the snapshot that we’d like to mount, and the volume is guaranteed not to have been modified since (e.g. by a failed prior mount). The best is to think of the volume as a high-speed cache for the snapshot.
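The three conditions for this optimization can be written down as a small check. This is only an illustrative sketch: the `Volume` record and its field names are hypothetical bookkeeping you would have to maintain yourself, not something the EBS API provides in this form.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Volume:
    """Hypothetical bookkeeping record for an existing volume."""
    zone: str                        # availability zone the volume lives in
    matching_snapshot: Optional[str]  # snapshot the volume's contents correspond to
    modified_since_snapshot: bool    # e.g. True after a failed prior mount


def can_remount(volume: Volume, instance_zone: str, wanted_snapshot: str) -> bool:
    """True only if remounting is equivalent to restoring `wanted_snapshot`."""
    return (
        volume.zone == instance_zone
        and volume.matching_snapshot == wanted_snapshot
        and not volume.modified_since_snapshot
    )
```

If the check fails for any reason, the safe fallback is the normal path: create a fresh volume from the snapshot in the new instance's zone.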

Price

Estimating the costs of EBS is really quite tricky. The easy part is the storage cost of $0.10 per GB per month: once you create a volume of a certain size you’ll see the charge. The charge of $0.10 per million I/O transactions is much harder to estimate. To get a rough estimate you can look at /proc/diskstats on your servers. This will include something like this:

   8  160 sdk 9847 77 311900 56570 1912664 3312437 160672914 211993229 0 1597261 212049797
   8  176 sdl 333 86 4561 1538 895 51 19002 20131 0 4043 21669

which is just a pile of numbers. Following the explanation for the columns you should sum the first number (reads completed) and the fifth number (writes completed) to arrive at the number of I/O transactions (9847+1912664 for /dev/sdk above). This is not 100% accurate but should be close (I believe subtracting the 2nd and 6th numbers gets you closer yet, but I prefer an over-estimate). As a point of reference, our main database server is pretty busy and chugs along at an average of 17 transactions per second, which should total to around $4.40 per month. But our monitoring servers, prior to some recent optimizations, hammered the disks as fast as they would go at over 1000 random writes per second sustained 24×7. That would end up costing over $250 per month! As far as I can tell, for most situations the EBS transaction costs will be in the noise, but you can make it expensive if you’re not careful.
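The arithmetic above can be sketched in a few lines of Python. The function names are made up for illustration, the price is the $0.10 per million transactions quoted in this post, and a month is assumed to be 30 days.

```python
PRICE_PER_MILLION_IO = 0.10          # USD, the figure quoted in this post
SECONDS_PER_MONTH = 30 * 24 * 3600   # assumes a 30-day month

SAMPLE = """\
   8  160 sdk 9847 77 311900 56570 1912664 3312437 160672914 211993229 0 1597261 212049797
   8  176 sdl 333 86 4561 1538 895 51 19002 20131 0 4043 21669
"""


def io_counts(diskstats_text):
    """Return {device: reads completed + writes completed} from diskstats text."""
    counts = {}
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or truncated lines
        device = fields[2]
        reads_completed = int(fields[3])   # 1st number after the device name
        writes_completed = int(fields[7])  # 5th number after the device name
        counts[device] = reads_completed + writes_completed
    return counts


def monthly_io_cost(transactions_per_second):
    """Estimated monthly I/O charge in USD for a sustained transaction rate."""
    transactions = transactions_per_second * SECONDS_PER_MONTH
    return transactions / 1_000_000 * PRICE_PER_MILLION_IO


counts = io_counts(SAMPLE)
database_cost = monthly_io_cost(17)       # the busy database server above
monitoring_cost = monthly_io_cost(1000)   # the monitoring servers above
```

Keep in mind that the counters in /proc/diskstats are cumulative since boot, so to get a per-second rate you need to sample twice and divide the difference by the interval between samples.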

The cost of snapshots is harder to estimate due to their incremental nature. First of all, only the blocks written are captured on S3 (i.e. blocks on the volume that have never been written are not stored on S3). Second, it’s tricky to talk about the cost of an individual snapshot because snapshots share blocks with one another.

Summing it up

All in all it’s amazing how simple EBS is, yet how complex a universe of options it opens. Between snapshots, availability zones, pricing, and performance there are many options to consider and a lot of automation to provide. Of course at RightScale we’re busy working out a lot of these for you, but beyond that it is not an overstatement to say that Amazon’s Elastic Block Store brings cloud computing to a whole new level. I’ll repeat what I’ve said before: if you’re using traditional forms of hosting it’s gonna get pretty darn hard for you to keep up with the cloud, and you’ve probably already fallen behind at this point!

Amazon EBS - Tool and Library Support

Filed Under (Amazon EC2) by AWS Editor on 21-08-2008

This is a companion post to my earlier post — Amazon EBS (Elastic Block Store) - Bring Us Your Data. In the other post you can read about the features of EBS. This post goes into more detail on the tool and library support that has been built by our community of third-party developers.

Here are some tools:

And some libraries (some of the third parties will finalize their support in a day or two):

– Jeff;

PS -  I'll be updating this post a couple of times in the wake of the EBS launch so come back again soon.

Vertica / Sonian / Amazon Webinar

Filed Under (Web Services News) by AWS Editor on 21-08-2008

Earlier this year I talked about the unique and powerful AWS-powered solutions offered by Vertica and Sonian.

Tomorrow (August 21st), I will be taking part in a unique, three-party webinar. In the webinar you’ll get to hear from me, from Vertica Field Engineering Director Omer Trajman, and from Sonian CTO Greg Arnette. The webinar will start at 8 AM PST.

In the webinar you will learn how cloud computing is changing the economics of data warehousing and large-scale analytic database applications. You’ll hear how Sonian has built and launched a cloud-based digital content archiving system on top of Amazon EC2 and the Vertica Analytical Database for the Cloud.

The webinar is free but you do need to register ahead of time. Hope to see you there.

– Jeff;
