Doubts about language

15 posts were split to a new topic: Licensing speculation

To clarify this point: copyright law covers only the source code; it does not and cannot apply to the underlying mathematical idea. Math cannot be copyrighted. Literally transcribing source code should be considered a derivative work and only done under the original license terms. Implementing an algorithm “from scratch” based on a mathematical description is not a derivative work.

2 Likes

I’ll add that this depends on what you mean by transcribe. If you mean something automatic, then it’s in general impossible or useless. The reason julia is fast is not that it has a good compiler; in fact we have a pretty terrible one. The performance is tightly bound to many language design decisions that make code easier for the compiler to analyse. It is of course possible to use an R-like syntax, but it won’t have semantics anything close to R, so it’ll be of very limited usefulness. OTOH, if you implement a lot of R features, it’ll be very unlikely to perform better than, or even comparably to, R, since our runtime is not optimized for it.

5 Likes

Surely it cannot be illegal to transcribe code? You mean that you cannot release transcribed code under just any licence?

Edit: or exploit it commercially and so on…

2 Likes

The nice thing about that statement is that a lot can be done after 1.0 to greatly improve the performance of the parsing, lowering, and JITting code, and to make the compiler do a much better job at optimization.
What would be pretty much impossible to fix later on would be the underlying language design, which is a pretty wonderful one.

3 Likes

There are many R language users who use the betareg package, for example.

I wish they could enjoy the Julia language and that it had functions to work with the Beta Regression Model. I’m not saying I want to put useless features of R into Julia; that would be dumb of me.

What I mean is making these mathematical functions available in Julia in a convenient way, so that statisticians can enjoy using Julia and become interested in the language.

1 Like

I’m starting to converge on the idea of waiting for the language to mature a little. Hopefully, once version 1.0 is released, I can venture out with a little more peace of mind.

I see a lot of statisticians who program a lot in C and R but do not use Julia because, apart from programming for pleasure, they need statistical methodologies to produce their papers and are often not willing to implement everything from scratch.

Here in Brazil a programming language called Lua was produced. It is a great language that many Brazilians do not use. This language was created at PUC-Rio by the Tecgraf group, which developed software for Petrobras. What I realize is that many people choose a language for what it already has available, and over time (asymptotically) those users become interested in contributing implementations.

I may be wrong, but perhaps incorporating mathematical methodologies from other areas will open the door to the Julia language for many people.

I worry about this since there are initiatives to reimplement base R by companies that already use the R language for their data analysis, such as Microsoft, Google, Amazon, Oracle, Facebook, etc.

Link: https://mran.microsoft.com/open/

To be more clear, making it possible to implement a useful R transpiler is extremely unlikely to be part of julia 1.0 or even 2.0, if ever. The reason is that our compiler is optimized for julia code. The semantic requirements of R, and the R code that depends on those features, are “bad code” by this standard, since the semantics itself makes the code much harder to optimize statically. It is of course possible to optimize them to some degree with fancy JIT techniques, but since those patterns will not appear in julia code and they are very hard to implement in general, it won’t be something we work on with any priority. For more information about the difference between the approach we are taking and that of other scripting languages with a JIT added as an afterthought, see https://www.youtube.com/watch?v=cjzcYM9YhwA .

As for when it’s a good time to use julia: if you do not want to invest any time in some level of internals (which is a perfectly valid choice), waiting for 1.0 is a reasonable choice. I do recommend starting now with some throwaway scripts to get a feeling, though. If you want to use julia now, it’s almost certain that you’ll see some breakage before 1.0 is released, though the magnitude of it strongly depends on your use case (I’ve seen 1yr+ old code running just fine with minor depwarns, and I’ve also seen a package broken multiple times in a single release cycle).

1 Like

3 posts were split to a new topic: Notes on the Julia computer (JIT vs static)

While such altruistic motives are laudable, in the long (or not so long) run they usually lead to abandoned packages. Write something that you would use.

If, for example, betareg is such a method, then specifically

  1. read the vignette,
  2. understand the method (probably requires reading some of the cited papers),
  3. understand the numerical challenges (if any),
  4. refer to the source code only to clarify some points.

Iterate 1-4 until you have tons of nicely running unit tests. Then profile and document your code.

Note that 1-3 will not require looking at any R sources. The focus is on understanding, not transcribing.
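
To make steps 1-3 a bit more concrete, here is a rough sketch of what a from-scratch starting point could look like. It assumes the mean-precision parametrization described in the betareg vignette, y ~ Beta(μφ, (1−μ)φ), with a logit link for the mean. It is written in Python with only the standard library purely for illustration (a Julia version would have the same shape), and the names `inv_logit`, `beta_logpdf` and `loglik` are made up for this example, not taken from any existing package.

```python
import math

def inv_logit(eta):
    """Inverse logit link, written to avoid overflow for large |eta|."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)

def beta_logpdf(y, mu, phi):
    """Log-density of Beta(mu*phi, (1-mu)*phi) at y, for 0 < y < 1.

    Assumption: mean mu in (0, 1) and precision phi > 0.
    """
    if not 0.0 < y < 1.0:
        raise ValueError("y must lie strictly between 0 and 1")
    if not 0.0 < mu < 1.0 or phi <= 0.0:
        # A numerical challenge from step 3: for extreme linear predictors
        # mu can round to exactly 0 or 1, which a real implementation
        # would have to handle more gracefully than this sketch does.
        raise ValueError("need 0 < mu < 1 and phi > 0")
    a = mu * phi
    b = (1.0 - mu) * phi
    return (math.lgamma(phi) - math.lgamma(a) - math.lgamma(b)
            + (a - 1.0) * math.log(y) + (b - 1.0) * math.log1p(-y))

def loglik(beta, phi, X, y):
    """Log-likelihood for coefficients beta, precision phi, rows X, responses y."""
    total = 0.0
    for row, yi in zip(X, y):
        eta = sum(b * x for b, x in zip(beta, row))  # linear predictor
        total += beta_logpdf(yi, inv_logit(eta), phi)
    return total

# Unit-test style sanity checks against cases that can be derived by hand:
# Beta(1, 1) is uniform on (0, 1), so its log-density is 0 everywhere.
assert abs(beta_logpdf(0.3, 0.5, 2.0)) < 1e-12
# Beta(2, 2) has density 6*y*(1-y), which equals 1.5 at y = 0.5.
assert math.isclose(beta_logpdf(0.5, 0.5, 4.0), math.log(1.5))
```

Checks like the last two come straight from the math, not from anyone's source code, which is the point: the tests pin down the method, and the fitting (maximizing `loglik` over `beta` and `phi`) can then be built on top with whatever optimizer you prefer.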

1 Like

The license does not matter here. By transcription I mean the mathematical functionality. In fact, corrections or improvements may be made in the process, including adjustments to use the Julia language properly. The code itself will not be imitated. What will be similar is the mathematical functionality covered by the package.

Sounds good. I didn’t want to discourage you, only highlight that licensing was an important consideration.

Actually the data representation framework (used in statistical analysis) is one area where the julia API is not yet solid and stable. There have been attempts with various models for dataframes (i.e. tables where columns may have different types and missing values may occur), and there are lots of discussions here on Discourse if you are interested in learning more.

A good package for beta regression would probably import and build on the GLM.jl package, and would possibly use the formula framework being developed here: https://github.com/kleinschmidt/StreamModels.jl . If you open an issue on the GLM package I am sure the people there would be happy to help you get started. This would also help you make your functionality as compatible as possible with existing statistical analysis packages.

Good luck! I would definitely use BetaReg.jl when it’s implemented :slight_smile:

Since you are talking about writing your own packages to implement some of the functionality that you use in R (which would be a great thing to have more of in Julia, done natively, in a Julian fashion), I wouldn’t wait for v1.0.
I don’t expect there to be any more breaking syntax changes after v0.6 (although syntax that is currently invalid may be given a meaning before v1.0), so I think now might actually be a very good time to get started.

1 Like

Be careful with that. Expect breaking changes to happen years from now. Everything always updates. However, v1.0 is made for long-term compatibility, which means there’s much less change to worry about than in the current situation. There will be breaking changes going to v2.0, but likely nowhere near what we see now, since what’s happening now is essentially “now or never” changes.

1 Like

I expect that there will be many changes after v1.0, I just hope that they are not breaking syntax changes.
Of course, I’m used to languages like C and Caché ObjectScript, which haven’t really had any breaking syntax changes in decades, even with major new syntax being added.
I think breaking changes are more likely in some of the APIs than in the syntax itself, although hopefully @compat will be able to handle any future syntax changes, if any should occur in v2.0.

I don’t think you can do that - I’ve been cautioned by lawyers to not even read GPLed source code, as then if I implement something similar in the future, it might be taken to be a derivative work.

This is what a Chinese wall (clean-room design) is for: have a friend read the code, and answer any questions you have.

1 Like

Not the best example. See DataTables or DataFrames?

I have more to say on this but will do so in the other thread.

Just beware that while math should not be patentable either, it seems to be possible in practice. Arithmetic coding is math, right, and it was patented (I believe the patents have run out, but even now it is not used by browsers, even though it is an option in JPEG).