Including the external python library NetworkX for graphs

Discussion about the project in general, organization, website, or any other details that aren't directly about the game.
Post Reply
Message
Author
Morlic
AI Contributor
Posts: 296
Joined: Tue Feb 17, 2015 11:54 am

Including the external python library NetworkX for graphs

#1 Post by Morlic »

I would like to suggest to introduce the NetworkX library (https://networkx.github.io/) in FreeOrion to handle graphs on the python side (mainly the AI).

Relying on boost-graph is not an option because it breaks the currently pure python implementation of the AI logic with all its advantages.
Self-implementing all the relevant standard algorithms is unreasonable if we can simply use an external library.

Key points of NetworkX:
  • Open Source under BSD-license (https://networkx.github.io/documentatio ... legal.html). Seems compatible with our license but someone with more knowledge of that stuff should confirm.
  • Pure Python implementation
  • Cross-platform compatible
  • There are versions compatible with Python 3.2 and higher (in case we ever upgrade our Python version)
Being a pure python implementation, it is obviously slower than C++ implementations with Python wrapper (e.g. http://igraph.org/python/). Being pure python, however, makes the inclusion of NetworkX into the project as well as modifying/extending the algorithms rather trivial.
If performance eventually does become an issue for certain algorithms, we can reimplement those using the boost-graph library and our python interface. Prototyping, testing and tuning, however, would still be possible in pure Python.


Any thoughts and discussion on this from the project lead and the other devs are welcome.
If I provided any code, scripts or other content here, it's released under GPL 2.0 and CC-BY-SA 3.0

User avatar
Dilvish
AI Lead and Programmer Emeritus
Posts: 4768
Joined: Sat Sep 22, 2012 6:25 pm

Re: Including the external python library NetworkX for graph

#2 Post by Dilvish »

Morlic wrote:Relying on boost-graph is not an option because it breaks the currently pure python implementation of the AI logic with all its advantages.
I think that "not an option" is rather overstrong; it seems to at least be an option for us to simply expose a portion of boost-graph via the python interface (and perhaps build some related helper functions), which you yourself appear to acknowledge as a potential fallback in case the pure python implementation wound up too slow.

From the reference to inclusion of NetworkX being trivial, it sounds like you are proposing that we just include the whole NetworkX library as part of our installation rather than requiring players to separately download it. As we have seen, relying on boost is not entirely hassle-free, but adding an extra library source seems like it would create at least a bit of an additional maintenance issue. On the license front, it tentatively looks to me like they are using the original BSD license, which seems to have murky compatibility with GPL.

One possibility that you might have contemplated but which I don't see clearly expressed, could be that we use NetworkX to explore and prototype some graph algorithms for AI use, but use boost-graph for our actual production implementation.
If I provided any code, scripts or other content here, it's released under GPL 2.0 and CC-BY-SA 3.0

Morlic
AI Contributor
Posts: 296
Joined: Tue Feb 17, 2015 11:54 am

Re: Including the external python library NetworkX for graph

#3 Post by Morlic »

I think that "not an option" is rather overstrong; it seems to at least be an option for us to simply expose a portion of boost-graph via the python interface (and perhaps build some related helper functions), which you yourself appear to acknowledge as a potential fallback in case the pure python implementation wound up too slow.

...

One possibility that you might have contemplated but which I don't see clearly expressed, could be that we use NetworkX to explore and prototype some graph algorithms for AI use, but use boost-graph for our actual production implementation.
There is a strict difference between writing wrapper functions for some selected and established functions/algorithms and actually designing and testing new algorithms using the boost library. I have absolutely no objection to use the boost library where a suitable interface exists. I have no objection to incrementally build such an interface and reduce the NetworkX usage.

I do have a problem with the significantly reduced development speed until such an interface is built and tested. If I overestimate that effort and you think an extensive and flexible interface which allows us to work in python only is relatively quick to introduce and is not a maintenance hassle, then by all means I am in favor of the boost-graph solution. As I am certainly not the man to provide such an interface, I propose to use an external library where possible.

From the reference to inclusion of NetworkX being trivial, it sounds like you are proposing that we just include the whole NetworkX library as part of our installation rather than requiring players to separately download it.
Yes. As the library is pure python source code and should be platform independent, it seems the simplest to distribute it simply like our own Python files. Add the networkx directory to our python path and be done with it. The uncompressed size is something like 2.5MB which seems reasonable.
On the license front, it tentatively looks to me like they are using the original BSD license, which seems to have murky compatibility with GPL.
If I am not mistaken, the original BSD license is problematic because of the advertisement clause which is not a part of the NetworkX license (looks like the 3-clause, i.e. modified, BSD to me).

As we have seen, relying on boost is not entirely hassle-free, but adding an extra library source seems like it would create at least a bit of an additional maintenance issue.
The only maintenance issue that should occur is if we change our supplied python version. I am not completely sure if the 2.x version is compatible with 3.x python (it does use the six-library which suggests compatibility) but at the very least, 3.x versions of NetworkX exist. As far as I know, no plans currently exist to release come Python3-incompatible Python4. But given the time scope of FreeOrion, we may certainly run into that at some point...
If I provided any code, scripts or other content here, it's released under GPL 2.0 and CC-BY-SA 3.0

User avatar
Dilvish
AI Lead and Programmer Emeritus
Posts: 4768
Joined: Sat Sep 22, 2012 6:25 pm

Re: Including the external python library NetworkX for graph

#4 Post by Dilvish »

On the license issue, you are right, on looking more into the issue the license does look like it is the GPL compatible modified BSD license.

On the maintenance issue, I disagree that the only maintenance issue is if we change python versions. It appears to me that NetworkX still has bugfixes being periodically made, and so one of us would need to be keeping track of it to see if we need to upgrade our version periodically (and we may get into situations like with boost where we make local patches/workarounds and then have to track how long they are needed). That is not necessarily a dealbreaker kind of issue, but it seems to me that it needs to be acknowledged. There is also an issue regarding optional packageswhich I'll discuss more below. Are you willing to take on the responsibility for keeping track of such things?
I have absolutely no objection to use the boost library where a suitable interface exists. I have no objection to incrementally build such an interface and reduce the NetworkX usage. I do have a problem with the significantly reduced development speed until such an interface is built and tested. If I overestimate that effort and you think an extensive and flexible interface which allows us to work in python only is relatively quick to introduce and is not a maintenance hassle, then by all means I am in favor of the boost-graph solution.
I was not proposing preemptively exposing a large portion of boost-graph; my thought was to mostly rely on helper functions written in C++, like we currently do for some of our pathing code, and just expose additional items (helper functions or things directly from boost-graph) as it seemed suitable. It's not clear to me what your objection was to the idea of exploring and prototyping algorithms using NetworkX with the expectation of doing the "production" implementation with boost-graph, particularly given your statement that you "have no objection to incrementally build such an interface and reduce the NetworkX usage" -- is it that you are concerned about the possibility of putting significant time into implementing something in NetworkX but then risk there be some long indefinite delay before it was adapted for boost-graph? Unless Geoff or Marcel has some significant objection, I am OK with the idea that if NetworkX proves to be plenty speedy (** but see below**) we could just rely on including it, or if the speed is marginal then perhaps fall back to including NetworkX if we are not able to fairly promptly re-implement something using boost-graph. But if a NetworkX implementation of something wound up slowing down the AI too much then we would not roll it out in that form even if we did not have a ready reimplementation with boost-graph.

** re the speed of NetworkX-- it appears to me that NetworkX will rely on the optional Numpy and Scipy packages to speed up various calculations if those packages are present. I know I already have those installed on my machines and I have the impression you likely do as well, but we can't rely on our normal players having those, so any assessments of the time requirements for any of these graph algorithms would have to be done without reliance on those packages.

Marcel, Geoff, I'd appreciate it if you'd both state here whether or not you have any concerns about this proposal.
If I provided any code, scripts or other content here, it's released under GPL 2.0 and CC-BY-SA 3.0

User avatar
Geoff the Medio
Programming, Design, Admin
Posts: 13587
Joined: Wed Oct 08, 2003 1:33 am
Location: Munich

Re: Including the external python library NetworkX for graph

#5 Post by Geoff the Medio »

Dilvish wrote:Marcel, Geoff, I'd appreciate it if you'd both state here whether or not you have any concerns about this proposal.
Anyone can do almost whatever they want with FreeOrion and non GPL-code on their own machine, so prototype all you want with whatever. Just don't commit it to master. Expose whatever Boost stuff is useful through the Python API. Prototyping with one graph library that you plan to replace seems odd, as how they work is probably inconsistent to a degree that makes such a plan unlikely to be of much use vs. just implementing it in what you'll eventually use.

User avatar
Dilvish
AI Lead and Programmer Emeritus
Posts: 4768
Joined: Sat Sep 22, 2012 6:25 pm

Re: Including the external python library NetworkX for graph

#6 Post by Dilvish »

Geoff the Medio wrote:Prototyping with one graph library that you plan to replace seems odd, as how they work is probably inconsistent to a degree that makes such a plan unlikely to be of much use vs. just implementing it in what you'll eventually use.
A lot of graph algorithms are fairly standard, so I had thought there might be some value to prototyping in NetworkX even if we plan to reimplement with boost-graph. But if you are correct then we'd very likely want to just work with NetworkX (so long as its speed proves reasonable). Given that our graphs are likely to be relatively modest sized I am hopeful that NetworkX might prove to be fast enough for us. If that's the case, would you have an objection to our using it (in master)?
If I provided any code, scripts or other content here, it's released under GPL 2.0 and CC-BY-SA 3.0

User avatar
Geoff the Medio
Programming, Design, Admin
Posts: 13587
Joined: Wed Oct 08, 2003 1:33 am
Location: Munich

Re: Including the external python library NetworkX for graph

#7 Post by Geoff the Medio »

Dilvish wrote:...I am hopeful that NetworkX might prove to be fast enough for us. If that's the case, would you have an objection to our using it (in master)?
If it's not GPL compatible, yes.

User avatar
Dilvish
AI Lead and Programmer Emeritus
Posts: 4768
Joined: Sat Sep 22, 2012 6:25 pm

Re: Including the external python library NetworkX for graph

#8 Post by Dilvish »

Geoff the Medio wrote:
Dilvish wrote:...I am hopeful that NetworkX might prove to be fast enough for us. If that's the case, would you have an objection to our using it (in master)?
If it's not GPL compatible, yes.
I was wrong in my initial thinking that NetworkX used the original (GPL murky) BSD license, it's clear to me now that it uses the 3-clause modified BSD license, which is GPL compatible.
If I provided any code, scripts or other content here, it's released under GPL 2.0 and CC-BY-SA 3.0

User avatar
Vezzra
Release Manager, Design
Posts: 6095
Joined: Wed Nov 16, 2011 12:56 pm
Location: Sol III

Re: Including the external python library NetworkX for graph

#9 Post by Vezzra »

Geoff the Medio wrote:Just don't commit it to master.
By "it" are you referring to stuff that uses the library or the library itself? Because unless I'm missing something obvious, the library itself does not need to be integrated into our repo, just included into the SDK and the installer packages (like other dependencies). On Linux it would need to be pulled in as a dependency (assuming most of the "big" distros provide it via their repos, if not, that would be a deal breaker IMO).

Personally I have no objections to include that library. As long as the license is compatible (and it seems it is), it's just an additional dependency. We already use several libraries (Boost, Ogg, SDL, etc.), they all just happen to be for the C++ backend until now. I don't see a problem with using external/third party libraries on the Python side as well.

User avatar
Dilvish
AI Lead and Programmer Emeritus
Posts: 4768
Joined: Sat Sep 22, 2012 6:25 pm

Re: Including the external python library NetworkX for graph

#10 Post by Dilvish »

Vezzra wrote:Because unless I'm missing something obvious, the library itself does not need to be integrated into our repo, just included into the SDK and the installer packages (like other dependencies). On Linux it would need to be pulled in as a dependency (assuming most of the "big" distros provide it via their repos, if not, that would be a deal breaker IMO).
It had not crossed my mind that for Windows and MacOSX that we could just include it in the SDK and installer package, that does seem to make the issue even simpler than I had been thinking. In Ubuntu I was able to install it no problem via the command line

Code: Select all

sudo pip install networkx
If I provided any code, scripts or other content here, it's released under GPL 2.0 and CC-BY-SA 3.0

Morlic
AI Contributor
Posts: 296
Joined: Tue Feb 17, 2015 11:54 am

Re: Including the external python library NetworkX for graph

#11 Post by Morlic »

Dilvish wrote:Unless Geoff or Marcel has some significant objection, I am OK with the idea that if NetworkX proves to be plenty speedy (** but see below**) we could just rely on including it, or if the speed is marginal then perhaps fall back to including NetworkX if we are not able to fairly promptly re-implement something using boost-graph. But if a NetworkX implementation of something wound up slowing down the AI too much then we would not roll it out in that form even if we did not have a ready reimplementation with boost-graph.

** re the speed of NetworkX-- it appears to me that NetworkX will rely on the optional Numpy and Scipy packages to speed up various calculations if those packages are present. I know I already have those installed on my machines and I have the impression you likely do as well, but we can't rely on our normal players having those, so any assessments of the time requirements for any of these graph algorithms would have to be done without reliance on those packages.
I have those packages installed but they are not within the Freeorion python path, so that should probably not cause trouble. There are a few algorithms that absolutely require numpy/scipy (i.e. have no alternative implementation) but they seem unlikely to use in AI context.

On the maintenance issue, I disagree that the only maintenance issue is if we change python versions. It appears to me that NetworkX still has bugfixes being periodically made, and so one of us would need to be keeping track of it to see if we need to upgrade our version periodically (and we may get into situations like with boost where we make local patches/workarounds and then have to track how long they are needed). That is not necessarily a dealbreaker kind of issue, but it seems to me that it needs to be acknowledged. There is also an issue regarding optional packageswhich I'll discuss more below. Are you willing to take on the responsibility for keeping track of such things?
Right. I would be willing to keep track of such things.

I was not proposing preemptively exposing a large portion of boost-graph; my thought was to mostly rely on helper functions written in C++, like we currently do for some of our pathing code, and just expose additional items (helper functions or things directly from boost-graph) as it seemed suitable. It's not clear to me what your objection was to the idea of exploring and prototyping algorithms using NetworkX with the expectation of doing the "production" implementation with boost-graph ...
It is duplicate work which is unnecessary if the NetworkX implementation is fast enough. It makes development slower (both because it is duplicate work and because it is heavy templated C++ compared to Python) and yes, also adds possibly long delays after an implemenation which is ready for play-testing exists. If play-testing feedback is negative, it is once again easier/faster to do it in Python.

As far as initial work is concerned, we would at the very least need a solid interface to create and modify (i.e. add/delete edges/vertices, redefine weights etc) as well as iterate over a boost-graph representing object, preferably conversion functions to the NetworkX data structure as once a "production" implementation needs to be extended/modified, again working in Python is strongly prefered.

That being said, I am open for trying it first. So I would work with my local NetworkX installation, post a PR with that implementation, get play-testing feedback from you / possibly other devs that set up NetworkX locally and then once the algorithm is confirmed/polished, try to see how long the boost adaptation takes (and if it is necessary given the performance of the algorithm).
If I provided any code, scripts or other content here, it's released under GPL 2.0 and CC-BY-SA 3.0

Post Reply