GSoC 2010 Projects

If I have to pick from your list, it would probably be Clustering. I don’t need it much myself, but it is in high demand in the community.

You don’t have to pick from my list. The question is what would the list be? I just gave these as some examples, but what would you want the students to work on if it could be anything related to this and useful for others or yourself in regards to this FOSS?

Thanks

I know I don’t have to pick from your list, I just don’t have one of my own. Clustering seems a nice project, but maybe a bit too challenging? Also, multi-domain support has a high number of votes in Jira.

OK, great. The GSoC project already has an entry for XMPP, and I have arranged access to add to their wiki of project ideas. For these four ideas, what can I use as descriptions of the topics? I can search around and post back, maybe tomorrow, but if someone has pointers to Community Documents or JIRA links, I can formalize them better.

Thanks!

Multi-domain OF-162

It is too late for IgniteRealtime to register for this year’s edition of GSoC, but the XMPP Standards Foundation has already been accepted as a mentoring organization. The XSF GSoC project does accept student ideas for specific software projects.

Personally, I would welcome projects that relate to:

  • Portable Import/Export for XMPP servers: http://xmpp.org/extensions/xep-0227.html
  • Reducing the database footprint of Rosters in Openfire (possibly replacing the relational persistence model with a new model based on NoSQL techniques).
  • Constructive performance enhancements, which would include finding a structural solution to Openfire’s Achilles’ heel.
  • Constructive spec compliance enhancements.

On top of this, Safa Sofuoğlu, who already did a GSoC for Openfire a few years ago, is busy porting Openfire from Apache MINA to Netty. Perhaps this effort could be included too.

The student application period opens in two days! If you want to be a part of this year’s GSoC, be quick to formulate and submit your application! Please refer to the XSF GSoC '10 project page for the details.

The XSF GSoC '10 project page can be found here: http://wiki.xmpp.org/web/Summer_of_Code_2010

I have added these to the XMPP Standards Foundation ideas wiki. I will try and find additional ways to promote these efforts from the GSoC organization contacts.

Thanks!

http://wiki.xmpp.org/web/Summer_of_Code_2010_Project_Ideas#Specific_Server_Projects

Since the ideas page on the XMPP wiki links here, I would like to explain a little about the MINA to Netty port.

I’ve been running a busy MINA-based server (non-XMPP) for 18 months, which handled more than 40k concurrent users at peak times. Recently I switched to Netty, mainly because MINA was not secure: I have seen that it’s easy for an attacker to systematically crash a MINA 1.x server. The MINA developers are aware of the vulnerabilities, and they will probably be fixed in MINA 2.0, but I don’t know when the stable version will be released.

On the other hand, MINA had some issues with compression, and Guus had suspicions about other issues that may be caused by MINA.

JM-1115

After I switched to Netty, I saw an amazing performance boost: Netty simply halved the CPU usage of my application. The gain may not be that large for Openfire, but I believe it’s worth trying.

Folks,

I was asked to expand on these ideas in the wiki by providing a more complete proposal than I currently have. For each listed project, this would involve explaining in more detail what work is needed and how difficult it would be. Below is an example of a proposal. If you could attach a difficulty rating (‘Easy’, ‘Medium’, or ‘Hard’), plus some more in-depth background for the listed projects, then I can add them to the wiki. You can send me private mail or post back here. If you do not have time for this, don’t worry! I will try to formalize them for you based on the information already posted here, the adjoining links, and community searching, and update the wiki in a few days.

Thanks!

http://wiki.xmpp.org/web/Summer_of_Code_2010_Project_Ideas#XMPP_for_the_social_web

Thanks! I went ahead and added some explanation to my proposal.

Folks,

I updated the wiki just now and added better descriptions of each proposal. I also followed the format of earlier proposals, adding difficulty ratings and contacts.

Hope that helps

Hi all,

I am an undergraduate student of Computer Science at IIT Kharagpur, India. I am interested in applying for GSoC 2010. I was intrigued by the problem described as Openfire’s Achilles’ Heel (http://www.igniterealtime.org/community/docs/DOC-1925). I recently worked on a project that involved setting up a peer-to-peer file sharing network over Bluetooth, which also had an IM facility. As I understand it, our implementation was similar to that of Openfire in that every mobile had a thread pool, and every time it received a connection request, a thread from the thread pool was assigned to that particular connection.

We too faced the problem of network crashes, e.g. when all the threads in the pool got blocked (say, waiting on a blocking input stream) the user was helpless. He had to wait for one of the threads to finish. We solved this problem by implementing a scheduler, and I believe that the same approach may be effective here.

I did a bit of background study and discovered that the Openfire server currently does not have a scheduler.

As for the question of implementation… I am currently working on a formal proposal. I am trying to figure out which scheduler would work best, how it would be implemented, how we ensure the scheduler is fair, and other such issues.

But first I would like to know if such a project is feasible and, more importantly, does it solve the problem?

Your comments please…

Regards,

Anshul Singhle

Hello,

The usage of schedulers (or more generically, ExecutorServices) would indeed go a long way towards solving the Achilles’ Heel problem. Openfire does make use of such constructs, but this can (and should) be improved.

Most likely, the functionality implemented in Openfire should be grouped into individual parts. Each such part should then be assigned a set of resources (don’t limit these to schedulers/CPU-related resources alone; other resources such as database access might be of interest too!). The assigned resources should then be made available in some way to the implementation of each part.

I feel that the interesting part of this proposal would not be the introduction of schedulers, but finding a way to make sure that every scheduled task is finite, completes in a timely manner, does so without breaking functionality, and does not exhaust its available resources (or at least, making sure that exhausting resources in one part does not affect other parts of the system).
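
As a rough illustration of the "finite and timely" requirement, here is a minimal Java sketch. It assumes a hypothetical TimedTaskRunner helper (not an existing Openfire class) that gives each task a deadline, cancels it when the deadline passes, and returns a fallback value instead of blocking the caller.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of Openfire: runs tasks with a hard deadline.
class TimedTaskRunner {
    private final ExecutorService pool;

    TimedTaskRunner(int threads) {
        this.pool = Executors.newFixedThreadPool(threads);
    }

    // Returns the task result, or the fallback if the task fails or misses its deadline.
    <T> T runWithDeadline(Callable<T> task, long timeoutMillis, T fallback) {
        Future<T> future = pool.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker so its thread can be reused
            return fallback;
        } catch (ExecutionException e) {
            return fallback;     // the task itself threw an exception
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            future.cancel(true);
            return fallback;
        }
    }

    void shutdown() {
        pool.shutdownNow();
    }
}
```

Note that cancellation only frees the thread if the task responds to interruption. A task stuck in non-interruptible I/O would still occupy its worker, which is part of why the timeout alone does not solve the problem.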

I’d be interested in your proposal!

Hi,

Can you direct me to the current scheduler implementation? Also, can you give me some pointers on what I should include in my proposal? I already have a draft, but I would like it to be as thorough as possible.

Also, can you shed some light on the various types of requests handled by the Openfire server? This information is indispensable for any implementation of a scheduler.

Regards,

Anshul Singhle

As I said - don’t focus on thread pools / schedulers alone (a number of such thread pools are identified in the Achilles’ Heel article). This will solve only part of the problem.

I would suggest that you base your proposal on finding a way to “sandbox” the available resources, if you will. Resources that are available to one sandbox should not be influenced by anything that is going on in another sandbox. Coming up with a way to first identify parts of Openfire that are to be sandboxed and then implementing such sandboxes is very, very likely to keep you busy all summer (do take into account that we will require that this solution works for the clustered version of Openfire too!)
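
To sketch what such a sandbox could look like in Java: the example below assumes a hypothetical ResourceSandbox class (not an existing Openfire class) that gives each functional area its own fixed budget of a scarce resource, for example database connections, using a Semaphore. It ignores clustering entirely and only shows the isolation idea.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical illustration: each sandbox owns a fixed budget of a scarce resource.
class ResourceSandbox {
    private final String name;
    private final Semaphore permits;

    ResourceSandbox(String name, int maxConcurrent) {
        this.name = name;
        this.permits = new Semaphore(maxConcurrent);
    }

    // Runs the work if a permit becomes free within waitMillis; otherwise returns the fallback.
    <T> T execute(Supplier<T> work, long waitMillis, T fallback) {
        boolean acquired = false;
        try {
            acquired = permits.tryAcquire(waitMillis, TimeUnit.MILLISECONDS);
            if (!acquired) {
                return fallback; // this sandbox is exhausted; other sandboxes are unaffected
            }
            return work.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        } finally {
            if (acquired) {
                permits.release();
            }
        }
    }

    int available() {
        return permits.availablePermits();
    }

    String name() {
        return name;
    }
}
```

With one sandbox per area (say, one for MUC and one for PEP), exhausting the MUC budget makes further MUC work fail fast, while PEP work still gets its own permits.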

OVERVIEW OF SANDBOXING TECHNIQUE.

T1, T2, …, Tn are types of interactions, e.g. a PEP session or a MUC event.

  1. Create a PacketInterceptor that receives XML requests directly from the socket. This interceptor classifies the packets as T1, T2, …, Tn (say) and forwards each packet to the RequestHandler class for its type.

  2. Create a RequestHandler class for every type. It will have a queue that stores the requests, and it can spawn child threads that perform the specific actions related to that type.

  3. Multiple such child threads are created (the number can be decided later… a minor issue). Each child thread has access to a database connection pool (this pool is associated with the parent RequestHandler class). Each thread can then choose any number (?) of connections from the pool.

  4. The RequestHandler class has a database pool manager. Whenever a child thread wants access to connection(s) from the pool, this manager will decide if and when to grant that access. <Implementing this is an issue. I’m currently working on an implementation scheme.>

  5. I also plan on implementing an inter-thread communication protocol. If one type of thread needs some information from another type of thread, it will send a request to the request handler of that type. There will be a timeout mechanism to ensure that this thread is not blocked because the other thread is not replying. If normal function can be resumed without that information, it will be; otherwise this thread will send the request again, and the timeout will be slightly increased. To prevent frequent timeouts, a backdoor will be implemented on the RequestHandler class that gives priority to requests from fellow threads over requests from clients. Also, essential requests will be given priority.
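
Steps 1 and 2 above can be sketched in Java roughly as follows. All names here (InteractionType, TypedDispatcher) are hypothetical and are not existing Openfire classes, and the string-matching classifier is only a stand-in for inspecting a parsed stanza. Each type gets its own worker threads and its own bounded queue, so a backlog in one type cannot consume the threads of another. The database pool manager and the inter-thread protocol of steps 3 to 5 are not covered.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical interaction types (the T1, T2, ..., Tn of the overview).
enum InteractionType { MUC, PEP, OTHER }

// Hypothetical dispatcher: one bounded queue and one set of workers per type.
class TypedDispatcher {
    private final Map<InteractionType, ThreadPoolExecutor> handlers =
            new EnumMap<>(InteractionType.class);

    TypedDispatcher(int threadsPerType, int queueCapacity) {
        for (InteractionType type : InteractionType.values()) {
            handlers.put(type, new ThreadPoolExecutor(
                    threadsPerType, threadsPerType, 0L, TimeUnit.MILLISECONDS,
                    new ArrayBlockingQueue<>(queueCapacity)));
        }
    }

    // Toy classifier based on XML namespaces; a real one would inspect the parsed stanza.
    static InteractionType classify(String stanza) {
        if (stanza.contains("http://jabber.org/protocol/muc")) {
            return InteractionType.MUC;
        }
        if (stanza.contains("http://jabber.org/protocol/pubsub")) {
            return InteractionType.PEP;
        }
        return InteractionType.OTHER;
    }

    // Returns false when the queue for this stanza's type is full.
    boolean dispatch(String stanza, Runnable work) {
        try {
            handlers.get(classify(stanza)).execute(work);
            return true;
        } catch (RejectedExecutionException e) {
            return false;
        }
    }

    void shutdown() {
        handlers.values().forEach(ThreadPoolExecutor::shutdownNow);
    }
}
```

One design question this leaves open is what to do with a rejected request: drop it, answer the client with an error, or retry later.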

I hope that this scheme solves our problems without creating too many problems of its own. Do you predict any problems with this? Is this scheme sufficient, or does it need some more features? Can you think of any particular feature? I’d be glad if you could answer these questions.

Oh, and I almost forgot: I have a scheme that ensures that this setup will work in a clustered version. I believe that, in general, no modifications to the above scheme are needed to support clustering. Rather, we will need to (slightly) modify the clustering implementation and/or create some hooks.

Another point: the existing thread pool and database connection pool would be removed (naturally).

I have posted a detailed formal proposal at http://gsocopenfire.wordpress.com/2010/04/05/gsoc-proposal/. Please check it out…

I will be applying formally very soon.

Anshul, I have read your proposal, and sent you some email in review.

Hope that helps

Hi,

I read your review email and it has certainly given me a lot to think about… But first I would like to explain what my proposal actually intends to do.

The basic aim is to modularize the server without any loss of functionality. Why is this necessary? Because, as I understand it, the Achilles’ Heel problem basically arises because the server uses the same set of resources to handle all its tasks. As a result, if some task blocks those resources, ALL tasks are affected. If the server is modularized, then under heavy load it can still provide a limited service instead of dying completely.

I understood from your review that you are not sure whether the problem I am trying to solve is really the problem that needs to be solved. I studied the way the Openfire server works, and this is my interpretation of the problem.

You mentioned that:

“it may be useful to take a step back, and
merely create conditions that reproduce the error in Openfire, then
plot out how the server is responding by looking at process
utilization, profiling, and memory growth.”

Yes, that is an extremely valid point. However, I do not have the resources to do the above, and that is why I want to work with you guys. Please understand that my proposal is merely an attempt to solve the problem as I see it, and my analysis is based on my study of the code and my interpretation of how Openfire works. If I am given the chance to work with you guys, my first priority would be to identify and define the problem, and if the real problem turns out to be something else, I will draw up a plan to solve that problem. My proposal is merely a small demonstration of the whole cycle, and I admit that I too have no way to identify the real problem yet. Identifying the real problem is a major step in solving the Achilles’ Heel problem.

At this point I want to ask a small favour from you…Assuming the problem is what I anticipated… then does my proposal solve it??

Thanks for the heads up on the testing and implementation related issues. I forgot to think about them… Rest assured that I am currently working on a proposal for that too. As for “In the proposal, there is no mention of how you are planning on setting up a test bench”: working on it…

I appreciate you taking the time to review my proposal… Thanks a lot!!

Another note: I feel a little stupid asking this, but should I draft another proposal that is more general to the type of problem?