Working through Dynamic Programming

So i’ve been studying dynamic programming and its applications in software engineering. The idea of being able to pick out the “best” choice even among the simplest state machines is incredibly useful, as I often find it difficult to find the best cost-benefit ratio of things by hand, for example. The first place I went was here. It seemed like a good starting point on understanding what dynamic programming is, exactly. The example given, where a power plant manager has to evaluate a series of proposals to improve 3 plants and choose from them, given their costs and benefits, was an easy enough problem where you can see the solution on your own, but complicated enough to work out the problem myself through code and have it be a fruitful exercise.
So, I fired up Eclipse, and went to work on designing and writing code that would not only solve the problem , but would also be extensible enough to solve problems like it, or make it easy to change a few parameters up to see what happens. Here’s what I came up with:

import java.util.Hashtable;

public class DynamicMain {
	
	public static void main(String[] args){
		
		/*create and populate the 3 "plants" with relevant data.
		the parameters for creating a State Analyzer is the hashtable of options available, the location in the gamestate array that
		this edits, and the child of this gamestate.*/
		Hashtable ht3= new Hashtable();
		ht3.put(0, 0);
		ht3.put(1, 4);
		StateAnalyzer State3= new StateAnalyzer(ht3,2, null);
		
		Hashtable ht2= new Hashtable();
		ht2.put(0, 0);
		ht2.put(2, 8);
		ht2.put(3, 9);
		ht2.put(4, 12);
		StateAnalyzer State2= new StateAnalyzer(ht2,1, State3);
		
		Hashtable ht1= new Hashtable();
		ht1.put(0, 0);
		ht1.put(1, 5);
		ht1.put(2, 6);
		StateAnalyzer State1= new StateAnalyzer(ht1,0, State2);
		
		//find the "right" combination by evaluating a starting gamestate on the the plant we are beginning with.
		int[] arrayToUse= {-1,-1,-1};
		GameState result= State1.evaluateChoices(new GameState(arrayToUse,0,5));
		
		//print out the "correct" choices.
		for(int i=0; i<result.status.length; i++){
			System.out.println(result.status[i]);
		}	
	}
}

Here’s what my “main” method looks like. I put in all the data given by the example I followed, in a way that was interpretable. My StateAnalyzer objects represent the different plants that have options to upgrade, and the StateAnalyzer passed in as a constructor to some of them represents the StateAnalyzer to pass their states into ( a null means that there is no child, and to just send your choice and be done with it). This is also where I print out the optimal choice found by this series of plants. My StateAnalyzer class looks like this:

import java.util.Enumeration;
import java.util.Hashtable;

public class StateAnalyzer {
	
	//the array of choices available to this plant
	Hashtable choices;
	
	//the child to send our data to, if we had one
	StateAnalyzer child;
	
	//which element in the gamestate we are populating with our choices.
	int location;
	
	public StateAnalyzer(Hashtable ht, int loc, StateAnalyzer stateChild){
		choices= ht;
		location = loc;
		child=stateChild;
	}

public GameState evaluateChoices(GameState previousState){

//initially, the default state we pass back is just the one that was sent to us
GameState bestState= (GameState) previousState.clone();
Enumeration e = choices.keys();
//we will go through all possible choices to find which ones are valid.
while(e.hasMoreElements()){

int nextInt= (Integer)e.nextElement();
//if picking this choice is a valid move...
if(nextInt<=previousState.spaceLeft){

//update the current gamestate with this choice
GameState currentState= (GameState) previousState.clone();
currentState.status[location]= nextInt;
currentState.score+= (Integer)choices.get(nextInt);
currentState.spaceLeft-=nextInt;

//if we have a child, we then ask it to give us it's best choices based on the data we've given it.
if(child!= null){
GameState newState= child.evaluateChoices(currentState);
currentState= newState;
}
//if the state we evaluated is better than the current best one, set this new one to be the one we send back.
if(currentState.score>=bestState.score){
bestState=currentState;
}
}
}
return bestState;
}
}

The “meat” of this code is the evaluateChoices method. Essentially what happens here is that we iterate through all the possible options available to this current plant, skip over the ones that would result in us being over budget, and for the ones that aren’t, basically pull a “max” function on the results of all the choices available to us. The GameState class itself is just an object that holds the choices of the given plants, the current “score” (in this case, the total results of all the money spent currently), and the money left. This way, we can keep track of all the data available to us, and pass this data to the child plants without having to keep track of additional data in a messy way. For posterity, my GameState class looks like this:

public class GameState implements Cloneable{
//the current choices made
public int[]status;

//the current value of this gamestate
public int score;

//how much space there is left in this state.
public int spaceLeft;
public GameState(int[] arrayStatus, int curScore, int space){
status= arrayStatus;
score=curScore;
spaceLeft= space;
}

public Object clone(){
return new GameState(status.clone(),score,spaceLeft);
}
}

So there you have it! If you put all of this code in Eclipse, it should give you 1,3,1 as your output, which represents the option which cost 1 million for plant 1, 3 million for plant 2, and 1 million for plant 1, which is one of the two optimal solutions (the other being 2,2,1). Feel free to use the code and mess around with it if you like. You can also switch around the “order” of the 3 plants, and it will still come up with the same answer, so long as you evaluate the plant at the top of the chain. One thing that might be worth changing is the fact that it re-evaluates choices even if the data is similar except for the previous moves. If the “best” results of the given amount of money left were saved and duplicated instead of run through all over again, it would make the program run noticeably faster, especially if the amount of total plants are increased.
One thing I really like about dynamic programming is the idea of the “principle of optimality”, in that each current state(or plant, in this case) doesn’t have any knowledge about what plans the other plants have at all. All the plant does is find the optimal choice given how much money it has, the value these plans give, and the assumption that whatever money we pass on to the next plant will be spent optimally. Therefore, each plant does what is optimal, and for each state given to it, the plants will return the optimal solution given to it, given the fact that each other plant down the line will be choosing what is optimal as well. And if you give the beginning plant the state with no value and all money, it will find the most optimal way to spend that money among all plants, all thanks to the principle of optimality.
I like applying the idea of the principle of optimality to life, too. Whatever state your life is in right now, all you can do is play your cards as best as you can, give life your all, and try as hard as you can to do as well as you can. So if things don’t end up the way you might have wanted them to, at least you could reason that things ended up the best that they could have, thanks to the principle of optimality :P.

Advertisement

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s