Spinning Up in Deep RL (blog.openai.com)
167 points by stablemap 5 days ago | 17 comments





This actively developed and maintained package is a good approach to furthering RL development; as the write-up states, the biggest problem in RL is subtle implementation bugs that don't raise an error but tank learning performance. (Plus the loggers/utils help debug things.)

Granted, a lot of RL thought pieces/examples on places like Medium.com take an existing RL implementation without many tweaks, run it on a new task, and see what happens. A better RL library might make this workflow more prevalent, which is why it's very important for researchers to make their pipelines transparent.


I've made some effort to provide a set of similar high-quality implementations available in PyTorch: https://blog.millionintegrals.com/vel-pytorch-meets-baseline...

In my opinion, PyTorch code is easier for newcomers to understand and debug. The code is definitely lacking in documentation, but whenever there was a tradeoff between clarity and modularity, in the end I chose modularity. Ideally I'd like others to be able to take bits and pieces and incorporate them into their own projects to speed up the delivery of their ideas.


+1 on that, that's a great project.

PyTorch, with its explicit state that can easily be examined by hand in the PyCharm debugger, will be way easier for people coming into the field.
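
To illustrate what I mean by explicit state (just a toy sketch, not code from Vel): every parameter and intermediate tensor in a PyTorch model is an ordinary Python object you can print or step through in a debugger, with no session or graph in between.

    import torch
    import torch.nn as nn

    # Toy policy network, purely for illustration
    policy = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 2))

    obs = torch.randn(1, 4)      # fake observation batch
    logits = policy(obs)         # eager execution: the value exists right here

    print(logits)                # or set a breakpoint and inspect it in PyCharm
    for name, p in policy.named_parameters():
        print(name, p.shape, p.requires_grad)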


This is awesome, and I hope it will allow more people to experiment with algorithms instead of only re-applying OpenAI's Baselines. Baselines are great, but very hard (for me, at least) to tinker with.

It helps me understand something new if I can controllably break it. In other words, I progress by predicting the edge conditions where something shouldn't work, and then testing whether the algorithm indeed experiences the expected type of failure. A transparent algorithm implementation is key for this.

One thing I immediately checked in the spinningup repo is whether it uses TF Eager. It doesn't. @OpenAI, what's your reasoning for that?


Hi! Primary developer for Spinning Up here. The code for this was developed mostly in June and July this year, and Eager still felt relatively new to me. I wanted to wait for Eager to stabilize and hit maturity before investing in it. I also wanted to see how TF would change on the road to TF 2.0, since that could change the picture even more.

At the six month review in 2019, we'll evaluate whether it makes sense to rewrite the implementation examples for TF 2.0. I'll speculate that the answer will be "yes, it does." Since Eager execution will be a central feature of TF 2.0, the (probable) revamp for Spinning Up will include it.
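
For anyone unfamiliar with the distinction: the current implementations build TF 1.x graphs and run them in sessions, whereas with Eager enabled, ops execute immediately and values are inspectable on the spot. A rough illustration (not code from the repo):

    import tensorflow as tf
    tf.enable_eager_execution()   # in TF 1.x this must be called once, at program start

    x = tf.constant([[1.0, 2.0, 3.0, 4.0]])
    w = tf.Variable(tf.random_normal([4, 2]))
    y = tf.matmul(x, w)           # runs immediately; no Session or feed_dict needed
    print(y.numpy())              # values are inspectable right away, NumPy-style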

Good luck with your experiments! And please let us know about your experience with Spinning Up---we want to make sure it fulfills the mission of helping anyone learn about deep RL, and user feedback is vital for that.


Thank you for sharing your thought process!

Whenever I see high-quality submissions I bookmark them and promise myself to come back and spend time learning them.

This time... I promise myself it's different.


Hi! I really appreciate you sharing this with the community. The documents and code look really clear and concise. I do have one question: is it possible to change the dependency on the MuJoCo engine to something else (e.g. Roboschool)?

I don't have access to a computer with a GPU, and I'm currently using Google Colab for my DL projects. I tried installing MuJoCo on Colab, but unfortunately the computer id it generated seems invalid. Any help is highly appreciated.

Thank you!


Hi!

A few people had this question on Twitter also. Our response: "Several of us at OpenAI are thinking seriously about how to make something like this happen! I can't promise anything, but we definitely want to remove barriers to entry." (https://twitter.com/jachiam0/status/1060595172285632512)

In the meantime, you can still use Spinning Up with the Classic Control and Box2D envs in Gym (which don't require any licenses at all). And what's more: for most of these environments you don't need a GPU! CPU is fine.
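
For example, something like this (roughly following the usage pattern in the docs; the hyperparameters and output path here are just placeholders) runs fine on a laptop CPU:

    from spinup import ppo
    import tensorflow as tf
    import gym

    # Classic Control env: no MuJoCo license and no GPU needed
    env_fn = lambda: gym.make('CartPole-v0')

    ppo(env_fn=env_fn,
        ac_kwargs=dict(hidden_sizes=(64, 64), activation=tf.tanh),
        steps_per_epoch=4000,
        epochs=50,
        logger_kwargs=dict(output_dir='/tmp/spinup_cartpole', exp_name='ppo_cartpole'))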


Thank you for replying promptly. I'm willing to help with such a change if it's planned. Meanwhile, I'll get started running Spinning Up on my laptop.

This looks like an awesome initiative! I think it will be very valuable for people trying to enter the field. I particularly like the clear advice on how to get started doing RL research. Have you considered setting up a forum for the community to share their experiences?

Discovered two small issues in the doc. Where can I send feedback?

Let us know by opening an issue on Github: https://github.com/openai/spinningup/issues/new

Perfect. Thanks!

Is there a Dockerfile with everything set up already?

No, but please open up an issue on Github and we'll look into making one! :)

https://github.com/openai/spinningup/issues/new


I thought this was something about roguelikes. :(


