Julia is generally very memory efficient. It has around 200 MB of overhead at launch, but it gives you very good tools for writing memory-efficient data manipulation code. (The one exception is that String currently has unfortunately high overhead for small strings.)
I want to point out to observers of this thread that there seems to be some funny business going on here and in other discussions related to InMemoryDatasets. Specifically, there appear to be a sizable number of people in these threads who are masking their location and there are indications that they may be coming from a common location/network. This does not appear to be straight up sock puppetry (a la Henning Rousseau); posters seem to (mostly) actually be different people, but there does seem to be some sort of hidden agenda here. My guess is that the goal is to make it appear that there is more widespread dissatisfaction with DataFrames in the Julia community than there actually is. I feel I have to post this warning so that participants in the conversations are not taken in by any deception.
To those people in the thread who are doing this: first of all, welcome! You mostly appear to be new to the Julia community and you come bearing code, which is great! However, please consider taking a different approach here. First, stop trying to start flame wars with DataFrames developers; that is not cool. Also, please stop trying to appear to be independent people who just happen to all be fed up with DataFrames. It's fine if you all work together and are collectively frustrated with DataFrames. That's absolutely ok; just don't be deceptive about it. It is also totally fine to have developed a fork of DataFrames and compete with it. The MIT license very much allows that and if you think you can do better, by all means, give it a try.
(One thing that does need to be fixed is the license copyright notice: InMemoryDatasets appears to be derived from the DataFrames code base and the MIT license does require keeping the copyright notice intact, so if you could fix that, that would put the project on legally sound footing.)
Assuming that I'm correct about the "company that is collectively interested in improving on DataFrames" interpretation of what's going on, my suggestion would be: take a beat, reset the conversation, be direct about working together and that you have created and are promoting InMemoryDatasets as an open source alternative to DataFrames. Maybe I'm wrong about my interpretation and if so, feel free to let me know here or privately what's actually going on.
I think it would be good to give people a shot to give alternative explanations since I can imagine several hypothetical reasons for this beyond a desire to seem larger in size than appropriate.
Yeah, I'm absolutely open to something else being up here (feel free to DM me), but something odd is going on and it seemed like people ought to be aware.
The new InMemoryDatasets users have used all but "subtle ways".
I really want to make sure that this does not become accusatory and that we don't pile on. Nothing that has been done is terrible and it's really exciting to have people interested enough in data wrangling in Julia to take a crack at a new package like this. It's a lot of work and it's generously contributed for anyone to use. Who amongst us hasn't gotten a little vehement in our defense of Julia against its alternatives? Pointing out differences between similar packages can surely easily go the same way without ill intentions. We're not sure what the motivation is for the funky accounts, but please, please, please let's give people the benefit of the doubt.
Perhaps it's best to leave it at that (just the facts) and avoid speculating about intentions. That way anybody from the relevant group has an opportunity to explain if they choose, without feeling defensive.
Yeah, but allow me one remark: I don't want to read any more posts about the benefits of competition vs. cooperation (which we could discuss elsewhere). If we are going Darwinian in an open source forum, I'm out.
I think these kinds of posts shouldn't be included in a public discussion, because they do more harm to the community than good. Including the author, fewer than 30 people are involved in this topic, and calling that sizable is a little rash. I also searched for topics about InMemoryDatasets and found 7 so far, one of which is this announcement and two of which are mine.
Sorry to object: I'm thankful for this kind of transparency.
On the one hand, I can imagine the frustration of developers like @sl-solution at having improvement PRs rejected by mature libraries for compatibility reasons, which leads to these kinds of new developments.
On the other hand this is a bad sign, which needs to be addressed:
This is a very commonly misunderstood detail of the MIT license. Seems like a mistake.
Yeah, that's a very common mistake that I've made myself before. I think so long as people are gracious about fixing these things when brought to their attention and it's not some clear pattern of bad behaviour, it's best that we all assume that MIT license violations are accidental.
Awesome! Right package in the wrong language.
If you don't mind: could you help me understand why you joined a Discourse forum for a language you think is bad exactly 1 minute before posting a comment about a very specific package? What was the specific chain of events that led to that happening? It seems like a remarkable coincidence.
Sincerely, the most insulting thing about this comment is not the puerile attack on the language but how it genuinely underestimates the community's efficiency at spotting a troll on sight.
This comment is intended for the original author of this post, within a very specific context which I'm not obligated to share with you. Please don't make any further assumptions.
I would appreciate it if you could point me to specific guidelines that ban such an "insult" to the language.
Who said it was banned?
Calling someone out for poor behavior can be done regardless of whether that behavior is specifically banned.
And poor behavior is not ok just because you have a hidden agenda, as you yourself admit.
From the first reactions here I indeed conclude this to be a significant improvement. I'd also expect that the author put some thought into which language to use. So where do you think he misjudged?
Granted. But if you are not interested in discussing this publicly you could use Discourse PM instead.