Julia is a multi-paradigm programming language with a strong functional flavor. It was designed for numerical and scientific computing, which makes it a natural fit for machine learning and statistics. Python is another multi-paradigm language used for machine learning, though it is generally considered object-oriented. Julia, on the other hand, leans more toward the functional paradigm. Though Julia certainly isn’t as popular as Python, there are some huge benefits to using Julia for Data Science that make it a better choice than Python in a lot of situations.
It’s hard to talk about Julia without talking about speed. Julia prides itself on being very fast. Unlike Python, which is typically interpreted, Julia is a compiled language, and much of it is written in Julia itself. However, unlike traditional compiled languages like C, which are compiled before execution, Julia is compiled at run-time. Well-written Julia can come close to the speed of C, and in some benchmarks it matches it. Julia uses a just-in-time (JIT) compiler, so there is no separate build step, and working with it feels more like using an interpreted language than a traditional low-level compiled language like C or Fortran.
You might have noticed that I counted Python’s versatility as an advantage over Julia, and this is true: there are a lot of things that can be done with Python that you just can’t do with Julia. Of course, this is only natively speaking, because the versatility we’re talking about now is versatility across languages. Julia code can be executed from R, LaTeX, Python, and C. This means that a typical Data Science project has the potential to be written once in Julia and then run from another language through a wrapper, or just by sending strings of code.
PyCall and RCall are also pretty big deals. Given that a serious downside to Julia is its smaller package ecosystem, it’s really convenient to be able to call on Python and R whenever you need them. PyCall in particular is well implemented and very usable.
Julia has a distinctive type system with its own quirks and features, and one of the coolest of those features is multiple dispatch: the method that runs is chosen based on the types of all of a function’s arguments. First and foremost, Julia’s multiple dispatch is fast. On top of that, dispatch lets you attach behavior to a struct by defining methods for its type, rather than storing functions inside it. Because methods can also be defined for abstract types and shared by every subtype, this gives Julia a workable substitute for inheritance.
Not only that, but multiple dispatch makes functions extendable. This is a great benefit for package extensions, because once a function is explicitly imported, a user can add new methods to it. It would be easy to explicitly import a function from a package and extend it with a method that routes your own structs to a new implementation.
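To make the idea of multiple dispatch concrete for Python readers, here is a rough sketch that emulates it with a small registry. Everything in it (the `dispatch` decorator, the `_registry` dictionary, the `Animal` classes, and the `meet` function) is hypothetical and made up for illustration. It is not how Julia implements dispatch, and its fallback order is much cruder than Julia’s rules for picking the most specific method.

```python
from itertools import product

# Hypothetical registry: maps (function name, argument types) to an implementation.
_registry = {}


def dispatch(*types):
    """Register an implementation for one specific tuple of argument types."""
    def decorator(func):
        name = func.__name__
        _registry[(name, types)] = func

        def caller(*args):
            # Try the exact argument types first, then fall back through each
            # argument's parent classes until a registered method is found.
            mros = [type(arg).__mro__ for arg in args]
            for candidate in product(*mros):
                impl = _registry.get((name, candidate))
                if impl is not None:
                    return impl(*args)
            type_names = [type(arg).__name__ for arg in args]
            raise TypeError(f"no method {name} for argument types {type_names}")

        return caller
    return decorator


class Animal: pass
class Dog(Animal): pass
class Cat(Animal): pass


@dispatch(Animal, Animal)
def meet(a, b):
    return "two animals ignore each other"


@dispatch(Dog, Cat)
def meet(a, b):
    return "the dog chases the cat"


# "Extending" the function later: a new type and a new method,
# added without touching the definitions above.
class Bird(Animal): pass


@dispatch(Cat, Bird)
def meet(a, b):
    return "the cat stalks the bird"


dog_cat = meet(Dog(), Cat())    # uses the (Dog, Cat) method
dog_dog = meet(Dog(), Dog())    # falls back to (Animal, Animal)
cat_bird = meet(Cat(), Bird())  # uses the method added afterwards
```

The point of the sketch is that the choice of method depends on both arguments at once, and that new methods can be added after the fact. In Julia this is built into the language, so none of the registry plumbing is needed.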
Unlike Python, Julia was made with numerical and scientific computing in mind, which covers statistics and machine learning. Python was first released in the early 90s as an easy-to-use general-purpose language, and it has changed a lot since then. Given Python’s history and the wide variety of things it is used for, using a language that was made specifically for high-level numerical work could show a lot of benefits.
One way I see this benefiting Julia over Python is in linear algebra. Vanilla Python can chug through linear algebra, but vanilla Julia can fly through it. This is because Python was never designed around the matrices and equations that go along with machine learning. By no means at all is Python bad, especially with NumPy, but in terms of a no-package experience, Julia feels a lot more catered towards these sorts of mathematics. Julia’s operator syntax is a lot closer to that of R than Python’s, and that’s a big benefit. Most linear algebra is quicker and easier to write. Let’s show a dot product of two vectors in each language, just to illustrate this further:
Python -> y = np.dot(array1, array2)
R -> y <- sum(array1 * array2)
Julia -> y = array1' * array2
(On its own, array1 * array2 in R is element-wise multiplication, as is array1 .* array2 in Julia.)
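A dot product and an element-wise product are easy to mix up, so here is a minimal sketch of the difference in plain Python, with no NumPy required. The helper names `elementwise` and `dot` are my own for this illustration. For vectors, `*` in R, `.*` in Julia, and `*` on NumPy arrays all behave like `elementwise`, while `np.dot` behaves like `dot`.

```python
def elementwise(a, b):
    """Multiply matching entries; the result is as long as the inputs."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return [x * y for x, y in zip(a, b)]


def dot(a, b):
    """Sum the element-wise products; the result is a single number."""
    return sum(elementwise(a, b))


array1 = [1, 2, 3]
array2 = [4, 5, 6]

products = elementwise(array1, array2)  # [4, 10, 18]
y = dot(array1, array2)                 # 4 + 10 + 18 = 32
```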
I’ll be the first to say it: Julia’s Pkg package manager is a world above Python’s pip. Pkg comes with its own REPL mode and a Julia API, and from either one you can add, remove, build, and instantiate packages. This is especially convenient because of Pkg’s tie-in with Git. Updating is easy, adding packages is always easy, and overall Pkg is a pleasure to use over pip any day.
It doesn’t really matter which language you use, be it R, Julia, Python, or Scala. It is important to note, however, that every language has its downsides, and no language is ever going to be the “perfect language.” This is especially true if you are versatile in your programming, from machine learning to GUIs to APIs. With that being said, Julia is certainly one of my favorites in my arsenal, as is Python. Python has better packages, so if the project is small enough I’ll typically veer towards Python. But for data sets with millions of observations, it can be hard to even get that much data read into Python.
Overall, I look forward to the future of Julia. Julia is a lot of fun to write, and it will likely become even more viable for Data Science in the future.