Graduate School in Physics and Astrophysics
-------------------------------------------
ANNUAL REPORT
-------------------------------------------
Fill with a text editor (without TAB or formatting)
Repeat fields for each course as necessary.
-------------------------------------------
name: Dario BERZANO
email: dario.berzano@to.infn.it
ciclo: 26
year completed (1,2 or 3): 2
supervisor: Prof. Massimo MASERA
-------------------------------------------
GRADUATE SCHOOL COURSES
(only completed courses, with examination passed in the year)

code: ??? (webpage only shows 2012 codes but CFUs and hours are different)
title: Tecniche di Analisi Dati !!! 2011 !!!
teacher: Luciano RAMELLO
hours: 20 (5 CFU)

code: ??? (webpage only shows 2012 codes but CFUs and hours are different)
title: CP Violation !!! 2011 !!!
teacher: Fabrizio BIANCHI
hours: 16 (4 CFU)
-------------------------------------------
COURSES FROM OTHER GRADUATE SCHOOLS
(only completed courses, with examination passed in the year)

school: CERN openlab
title: Numerical computing
teacher: J. Arnold, M. Corden, F. de Dinechin, T. Mattson
hours: 14
!!! NOTE !!! Technically this is an actual course with evaluated exercises, and not a workshop or summer/international school - I was not sure where to insert it exactly.
-------------------------------------------
SUMMER SCHOOLS, INTERNATIONAL SCHOOLS
(only those attended in the current year)

title: GridKa school
place: Karlsruhe
webpage: http://gridka-school.scc.kit.edu/2012/
days: 5
talk or poster (Y/N): N

title: IEEE NSS MIC RTSD 2012 (School on Medical Imaging)**
place: Anaheim, CA
webpage: http://www.nss-mic.org/2012/
days: 8 (1 day of school, 7 days of conference)
talk or poster (Y/N): Y
** both school and conference (also reported below as a conference)
-------------------------------------------
CONFERENCES, WORKSHOP
(only those attended in the current year)

title: IEEE NSS MIC RTSD 2012 (Medical Imaging Conference)**
place: Anaheim, CA
webpage: http://www.nss-mic.org/2012/
days: 8 (1 day of school, 7 days of conference)
talk or poster (Y/N): Y
** both school and conference (already reported above as a school)

title: Workshop congiunto INFN+GARR
place: Napoli
webpage: http://agenda.infn.it/internalPage.py?pageId=0&confId=4801
days: 5
talk or poster (Y/N): Y
-------------------------------------------
VISITS AND STAGES
(only those done in the current year)

institution: CERN
place: CERN, Genève CH
starting date: Sep 1, 2012
days: 1.5 years
!!! NOTE !!! Starting from Sep 1 I am a doctoral student at CERN
---------------------------------------------------
Research activity/Publications in the current year
(max characters 2500)

Continuing the work we started one year ago at the Computing Centre of INFN Torino, we put many prototypes and concepts into production. Between January and May 2012 the whole computing infrastructure (the so-called Tier-2) was converted into a cloud computing centre where all machines run as virtual machines. The main customers (Grid and ALICE) now access Grid resources transparently as virtual machines: nothing changed on the client side.
An advantage of this approach is that there is now a single infrastructure, based on OpenNebula, that can host several dynamically sizable virtual farms, and not only the Grid farm. A physics group that needs a relatively small number of machines for a short time can now simply ask for a virtual farm allocation on top of the cloud infrastructure, instead of buying new hardware that stays unused most of the time and has to be configured manually. As soon as the virtual resources are no longer needed, they can be dedicated to other use cases in seconds.

Besides being convenient from both the economic and the management points of view, this approach lets experiments and projects expand and reduce their resources "on the fly", without prior knowledge of how much they will need. A "spot" user of a cloud infrastructure very likely does not know the required amount of resources in advance: with the OpenNebula infrastructure we can instantiate more or fewer copies of the same virtual machine in seconds, so that the assigned resources dynamically match the requirements.

Three customers have started using our system to good effect. The first was a theoretical physics lattice computation (Prof. G. Passarino) requiring a large amount of RAM on a single machine for at most two weeks: the resources were released as soon as the computations finished. Our second customer is J. Forneris, a Ph.D. student in E. Vittone's research group working on the IBIC (Ion Beam Induced Charge) technique: in this case the request was for a virtual farm made of a variable number of relatively "small" virtual machines running an MPI numerical simulation. The third use case is a joint effort between different Italian institutes, in a project that aims to help medical doctors detect suspicious nodules in a massive number of lung CTs.
Since lung cancer, one of the deadliest cancers, can be surgically removed with a high success rate if detected in time, massive screening of patients can potentially save many lives. Unfortunately, lung nodules are very difficult for the human eye to detect, since they can be very small. This is why the INFN MAGIC5 collaboration has developed a series of algorithms that automatically detect suspicious nodules and report their significance. A project named WIDEN (Web-based Image and Diagnosis Exchange Network) has been developed by diXit s.r.l., an INFN spin-off, to manage the workflow and trigger the detection algorithms through a familiar web interface: the medical doctor uploads CTs through it and then gets reports back via email and short messages. WIDEN is the front-end to the algorithms.

My contribution was to develop the back-end, i.e. the computing infrastructure that actually runs the algorithms. I have built a prototype of a lightweight elastic cloud computing farm driven by a batch system. As CTs are uploaded for processing, they are placed in an execution queue and analyzed in order on the different virtual machines that make up the virtual farm. When the number of enqueued items exceeds a configurable threshold, the virtual machines use the OpenNebula APIs to clone themselves and share the workload. As soon as one of the virtual machines becomes idle, it is automatically turned off to free resources for other users. This work was presented at the IEEE NSS MIC RTSD 2012 conference, and a paper is in preparation.
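
The scaling rule described above (clone when the queue exceeds a threshold, turn off idle machines) can be sketched as follows. This is only an illustrative sketch and not the prototype's actual code: the names FarmState, ScalingPlan and plan_scaling, the max_vms cap, the min_vms floor and the choice of requesting one clone per decision round are all assumptions made for the example. No OpenNebula call is shown; the function only decides what to do.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FarmState:
    """Snapshot of the virtual farm (hypothetical structure)."""
    queued_jobs: int                                     # CTs waiting in the batch queue
    busy_vms: List[str] = field(default_factory=list)    # VMs currently analyzing a CT
    idle_vms: List[str] = field(default_factory=list)    # VMs with nothing to do


@dataclass
class ScalingPlan:
    """What the farm should do next (hypothetical structure)."""
    clone_count: int = 0                                 # new VM copies to request
    shut_down: List[str] = field(default_factory=list)   # idle VMs to turn off


def plan_scaling(state: FarmState, queue_threshold: int,
                 max_vms: int, min_vms: int = 1) -> ScalingPlan:
    """Decide whether to grow or shrink the farm.

    Assumptions (not from the report): at most one clone is requested
    per call, the farm never grows beyond max_vms, and min_vms machines
    are kept alive so that new uploads can still be picked up.
    """
    total = len(state.busy_vms) + len(state.idle_vms)
    plan = ScalingPlan()

    # Backlog above the configurable threshold: request one more VM,
    # unless the farm has already reached its cap.
    if state.queued_jobs > queue_threshold:
        if total < max_vms:
            plan.clone_count = 1
        return plan

    # Empty queue: idle VMs can be turned off, down to the min_vms floor.
    if state.queued_jobs == 0:
        removable = max(0, total - min_vms)
        plan.shut_down = state.idle_vms[:removable]

    return plan


if __name__ == "__main__":
    # Five CTs queued, threshold of three, one busy VM: the farm grows.
    print(plan_scaling(FarmState(5, ["vm1"], []), queue_threshold=3, max_vms=4))
    # Nothing queued, two idle VMs next to a busy one: the idle ones go away.
    print(plan_scaling(FarmState(0, ["vm1"], ["vm2", "vm3"]), queue_threshold=3, max_vms=4))
```

Keeping the decision separate from the calls that actually clone or stop machines makes the threshold logic easy to check on its own, whichever API is used to carry it out.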