Interested in Summer of Code

proposal

#1

Hello everyone,
I am interested in applying for Google Summer of Code as a student this year under Julia. I would like to work on building a package similar to BinDeps that would make it easy to create data-providing packages (http://julialang.org/soc/projects/general.html#standardized-dataset-packaging).

I would like to know if anyone from the Julia community is interested in mentoring me on this project.

I remain in anticipation of a favourable and quick response.

Gramercy.


#2

Hey @americast, glad to hear you’re interested in this project. I think the best way forward would be to start discussing your design/approach with the community here. What are your current thoughts on how to go about this project?


#3

@MikeInnes, thanks a lot for your response! I opened an issue about this on GitHub (https://github.com/JuliaML/Roadmap.jl/issues/14) and am working on it as the community suggested there. I am going through the various JuliaML data packages to see what already exists. I am also creating a new BinDeps-like package that will make it easier to include such packages.

Gramercy…


#4

@MikeInnes, can you guide me a bit on what the “search and download” feature of the package should include? Should the package ship a list of URLs in a specified format (say, JSON), or should it search online? It would be nice if you could give me a short list of features you would like the package to have for searching and downloading ML datasets. I plan to do this part early.

Gramercy.


#5

I don’t think that’s something to worry about too much. It could just be a function like search("iris") exported by the package, for example. I think the more important thing will be to have the dataset downloading tools in place and then build search on top when it’s necessary and you have a few example datasets implemented.
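To make the ordering above concrete, here is a rough sketch of download-first, search-on-top. It is written in Python purely for illustration and is not the API of DataDeps.jl or any existing Julia package. The `Dataset` record, the `REGISTRY` dictionary, `register`, `fetch`, and the `example.invalid` URLs are all hypothetical names and placeholders introduced for this example.

```python
import hashlib
import urllib.request
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class Dataset:
    """Hypothetical record describing one downloadable dataset."""
    name: str
    url: str
    description: str = ""
    sha256: Optional[str] = None  # optional integrity check


# Hypothetical in-memory registry, keyed by lower-cased dataset name.
REGISTRY = {}


def register(dataset):
    """Add a dataset to the registry."""
    REGISTRY[dataset.name.lower()] = dataset


def fetch(name, cache_dir):
    """Download a registered dataset into cache_dir unless it is already cached."""
    dataset = REGISTRY[name.lower()]
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / dataset.name
    if not target.exists():
        with urllib.request.urlopen(dataset.url) as response:
            data = response.read()
        # Verify the checksum only when the registry entry provides one.
        if dataset.sha256 is not None and hashlib.sha256(data).hexdigest() != dataset.sha256:
            raise ValueError(f"checksum mismatch for {dataset.name}")
        target.write_bytes(data)
    return target


def search(query):
    """Return registered datasets whose name or description contains the query."""
    q = query.lower()
    matches = (
        d for d in REGISTRY.values()
        if q in d.name.lower() or q in d.description.lower()
    )
    return sorted(matches, key=lambda d: d.name)


# Placeholder entries; these URLs do not point at real files.
register(Dataset("iris", "https://example.invalid/iris.csv", "Iris flower measurements"))
register(Dataset("mnist", "https://example.invalid/mnist.gz", "Handwritten digits"))
```

In this sketch `search("iris")` only filters what has already been registered, so it stays a thin layer over the registry that `fetch` uses, and it can be added after a few example datasets exist.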

I don’t have much of a strong sense of what this project should look like myself, so I think you’d need to fill out the details and set the scope in your proposal. (Others may chip in with their own ideas of course.)


#6

@MikeInnes I am developing the package here: https://github.com/americast/DataDeps.jl (Documentation: https://github.com/americast/DataDeps.jl/wiki). Please have a look. The project is still in its infancy, so it would be nice if you could provide some feedback or create issues in the repo. It would help me in making further plans.

Also, just yesterday I came across this package: https://github.com/JuliaML/MLDataUtils.jl/. Though its purpose is entirely different from that of DataDeps, it provides many features that DataDeps should also provide, only used in a different manner. I was wondering if it would be better to build DataDeps on top of this package rather than creating a completely new one. Please guide me in this regard.

Gramercy again…


#7

Conversation has moved to:

We might want to close this thread.

