The Software Observatorium
notes on ossd process discovery research
14th-Apr-2006 02:42 pm - Google's Summer of Code 2006
It looks like Google is hosting another Summer of Code. From the FAQ:
Summer of Code 2006 is a program that offers student developers stipends to create new open source programs or to help currently established projects.

It was reported as a big success last year, and there is already a bit of excitement on the IRC channel. If you are interested in participating as a student or a mentor, check it out, as the deadlines for applying for both positions are fast approaching. Yay for open source!
7th-Dec-2005 07:23 pm - Candidacy Paper Abstract
From the paper that won't end, I bring you an abstract! There are three topics: data available in OSSD, processes executed in OSSD, and tools/techniques that might help analyze the data to extract the processes. The horrid length of time it's taking me to write the paper can be blamed directly on this three-pronged attack and the time spent weeding out would-be fourth (and beyond) prongs. Without further ado...

In their 1993 study of software process capture and analysis, Alex Wolf and David Rosenblum argued that a hybrid, manual and tool-aided, approach to capturing process data is necessary because purely automated approaches are “intently biased towards the computerized aspects of processes, while purely manual approaches are inefficient for high volumes of data” [Wolf and Rosenblum, 1993]. Since the time this statement was made, more and more aspects of development have become computerized. This is especially true in open source software development communities, where most if not all development is computerized. With computerization came tools for analyzing structured aspects of the new data, including workflow mining tools such as Balboa, developed by Dr. Wolf's future student Jonathan Cook. Workflow mining efforts produced many fruitful results in the latter half of the 1990s and beyond, and are well complemented by a flourishing breadth of tools mining other dimensions of software repositories. These tools, however, fail to incorporate unstructured data in software repositories, thereby providing an incomplete characterization of the processes (and other phenomena) they discover. The purpose of this paper is to examine tools and techniques for automating the analysis of semi-structured and unstructured data to complement the results already achievable via structured analysis of software repositories in order to discover software development processes. In doing so for the reasons above, and more explained below, this study also takes an in-depth look at the data available and processes enacted in open source software development (OSSD). The goal is not to prove that discovery is completely automatable for any or every process, but merely to see what can be achieved with current unstructured data analysis technology.
12th-Oct-2005 02:08 pm - Open Office Migration
I've recently decided to try making the switch to Open Office, with the upcoming release of version 2.0. For my purposes, it seems to have a lot of potential, but the learning curve has proven steeper than I'd hoped. As I'm prone to forget how I solved the various challenges I've faced thus far, I'm going to try documenting them here. In no particular order:

(The gory details)
14th-Sep-2005 02:38 pm - Brief Updates
Well, I'm back after an internship hiatus. Sad to leave, as it was a good experience, but I have lots of work to do now that I'm back. My project updates page is up and running (and no longer static) and I'm turning my full focus towards my candidacy paper until that finishes. As such, development work is on hold. I'm hoping to have my candidacy paper finished by the end of October and my topic defense by the winter holidays. Still working on the precise schedule, though. I'll also have to see about committee member availability.

A brief bit of research-related stuff: I read the chapter on "The Open Source Process" (don't get me started on that title) in Sean Egan's book on Gaim development and, interestingly enough, it features an organizational structure as a pyramid, in contrast to the onion diagrams we typically use. Interesting that he self-identified (perhaps even unknowingly ;) their organizational structure that way. Richard Gabriel and Ron Goldman also talk about the onion diagram in their book, "Innovation Happens Elsewhere: Open Source as Business Strategy."
7th-Aug-2005 02:42 pm - Candidacy Paper outline
As part of my degree, I must write a survey of my field of study. In my case, the topic is discovery (and modeling) of open source software processes. I've included Free/Libre in the subsection titles, but I haven't studied free/libre projects much thus far. What follows below is my working outline (.doc version here). Feedback is welcome, by which I mean appreciated. I apologize for the formatting and the length. Keep in mind, it's a draft :)

Open Source Software Process Discovery
1.Abstract
2. Introduction
3. A Brief Tour of Traditionally Held Notions of Software Development Processes
3.1.Measurement ? {CMM, COCOMO}, Lifecycles, clean-room, Agile Methods {XP, scrum, the agile process, more…}, RUP (rational unified process)
3.2.The focus here should be on lifecycle models and modeling, rather than measurement
3.3.Intraorganizational processes
3.3.1.Intraorganizational
3.3.1.1.Technical
3.3.1.1.1.As below in OSS processes
3.3.1.2.Sociotechnical
3.3.1.2.1.As below…
3.3.2.Interorganizational
3.3.2.1.Technical
3.3.2.1.1.As below in OSS processes
3.3.2.2.Sociotechnical
3.3.2.2.1.As below…
4.A Primer on Open Source
4.1.Vary in size (LOC, number of individuals, etc)
4.2.Vary in motivation
4.2.1.Free Source
4.2.2.Open Source
4.2.3.Licensing discussion?
4.3.Vary in openness
4.4.Vary in terms of community composition
4.4.1.Unincorporated individuals
4.4.2.Foundations
4.4.3.Corporately-led/backed communities
4.4.3.1.NetBeans, Open Office, Eclipse, Apache & Mozilla?
4.4.4.Open source corporations (i.e. Mozilla Corp)
5.Software Processes Under Investigation
5.1.Intraorganizational
5.1.1.Technical
5.1.1.1.Requirements and Release
5.1.1.1.1.How are these processes different from traditional/textbook development processes?
5.1.1.1.2.Implications for process discovery
5.1.1.1.3.Implications for process modeling
5.1.1.2.Quality Assurance
5.1.1.2.1.How are these processes different from traditional/textbook development processes?
5.1.1.2.2.Implications for process discovery
5.1.1.2.3.Implications for process modeling
5.1.1.3.…
5.1.2.Sociotechnical
5.1.2.1.Role Migration
5.1.2.1.1.How are these processes different from traditional/textbook development processes?
5.1.2.1.2.Implications for process discovery
5.1.2.1.3.Implications for process modeling
5.1.2.2.Leadership
5.1.2.2.1.How are these processes different from traditional/textbook development processes?
5.1.2.2.2.Implications for process discovery
5.1.2.2.3.Implications for process modeling
5.1.2.3.Conflict Negotiation
5.1.2.3.1.How are these processes different from traditional/textbook development processes?
5.1.2.3.2.Implications for process discovery
5.1.2.3.3.Implications for process modeling
5.1.2.4.Control
5.1.2.4.1.How are these processes different from traditional/textbook development processes?
5.1.2.4.2.Implications for process discovery
5.1.2.4.3.Implications for process modeling
5.1.2.5.Collaboration
5.1.2.5.1.How are these processes different from traditional/textbook development processes?
5.1.2.5.2.Implications for process discovery
5.1.2.5.3.Implications for process modeling
5.2.Interorganizational Processes
5.2.1.Technical
5.2.1.1.Similar to above
5.2.2.Sociotechnical
5.2.2.1.Similar to above
6.Existing approaches to process discovery
6.1.Manual
6.1.1.Field-Study Ethnography
6.2.Automated
6.2.1.Event capture
6.2.1.1.Cook/Wolf
6.3.Hybrid
6.3.1.Inadequacy of existing approaches for FLOSS Process Discovery
7.Existing approaches to process modeling
7.1.Informal
7.1.1.Narratives
7.2.Semi-Formal
7.2.1.Flow graphs
7.3.Formal
7.3.1.Petri-nets
8.Requirements for FLOSS Process Discovery Techniques
8.1.AKA The framework
8.2.AKA Motivation for a Multi-Modal Approach to Discovery and Modeling of FLOSS processes
8.2.1.AKA The dissertation prospectus
9.Requirements for Modeling of FLOSS Processes
10.Towards a Multi-Modal Approach to Discovery and Modeling of FLOSS Processes
10.1.Discovery
10.1.1.Process Meta model
10.1.2.Process reference model
10.1.3.(Partially?) supervised index-based learning from events capture in context.
10.2.Modeling
10.2.1.Narrative
10.2.2.Rich Hypermedia + Use Cases
10.2.3.Flow Graph
10.2.4.Formal Model (PML)
11.Preview of A Promising Implementation (AKA Future Work)
11.1.PADME – Process Architecture Discovery and Modeling Engine
11.2.More on this in the Topic Proposal
12.Conclusions
13.References
10th-Jul-2005 02:43 pm - Publication archive move...
Due to technical buggery, I've moved my publications to a server where I have more space and control. I've meta-refresh redirected my main publications page to the new server, but if, for whatever reason, you'd linked directly to a file, that link is now broken. I tried to have Apache redirect all queries to the new server, but it seems that functionality is disabled. Such is life. The new server is:

http://rotterdam.ics.uci.edu/papers/

I'm betting that no one with access to the machine will mess with things and they can live here happily ad infinitum, so cross your fingers.

On a separate note, I'm going to go live with a more day-to-day research journal (rambling about design issues in my code and such) in September, when I'll be able to set up the database on my server (it is remote from me at present and I disabled remote access), and keep this one for larger announcements. For now, the front page is here (with non-working links):

http://rotterdam.ics.uci.edu/journal/
26th-Jun-2005 11:32 am - ProSim05 paper
It occurred to me that I never posted my ICSE-ProSim paper to my pub list. Here it is:

Jensen, C. and Scacchi, W. Modeling Recruitment and Role Migration Processes in OSSD Projects, Proceedings of the Fifth Workshop on Software Process Simulation and Modeling (ICSE-ProSim), St. Louis, MO, USA, May 14-15, 2005. [March 5, 2005]
17th-Jun-2005 08:38 pm - Google Summer of Code
In case you haven't seen it, Google has expanded their Summer of Code program to allow more projects. I'm not sure if they're accepting more applicants or if they're handing out more awards to applicants who have already applied (my guess is it's the latter). Either way, it looks to be a great opportunity and I'm excited to see what comes of it.
6th-Jun-2005 08:01 pm - Wil van der Aalst: Workflow Mining
Wil van der Aalst has done a lot of work in mining workflows. It's about time I put the link up to his research page. Vladimir Rubin and Wilhelm Schafer are doing workflow mining of source versioning repositories inspired (I think that's the best term based on our few brief conversations) by this work and business process mining, though I don't know if they have a research project Web presence yet. Enjoy!
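For anyone new to the idea, here is a toy sketch (mine, not taken from any of the work above) of a common starting point in workflow mining: counting which activity directly follows which in an event log. Ordering relations of this kind are what van der Aalst's alpha algorithm builds on. The traces and activity names below are made up purely for illustration; a real log would come from something like a versioning repository or issue tracker.

```python
from collections import Counter


def directly_follows(traces):
    """Count how often activity a is immediately followed by activity b."""
    counts = Counter()
    for trace in traces:
        # Pair each activity with its successor within the same trace.
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


# Hypothetical event log: each list is the ordered activities for one change.
traces = [
    ["report", "patch", "review", "commit"],
    ["report", "patch", "review", "patch", "review", "commit"],
    ["report", "commit"],
]

follows = directly_follows(traces)
for (a, b), n in sorted(follows.items()):
    print(f"{a} -> {b}: {n}")
```

Pairs that show up in both directions (patch -> review and review -> patch here) hint at a loop in the underlying process, while pairs that only ever appear in one direction suggest a fixed ordering.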
29th-May-2005 05:22 pm - SPW 2005
I've posted photos from SPW 2005 and my afternoon at the Summer Palace (after the workshop ended) here.

I'll have a writeup on the experience when there is more time (and more photos when I receive them).