
Computer Science 1 Programming (CSCI 1300)

by: Allie West II


About this Document

This 35-page set of class notes was uploaded by Allie West II on Thursday, October 29, 2015. The notes belong to CSCI 1300 at the University of Colorado at Boulder, taught by Staff in Fall. Since its upload, it has received 9 views. For similar materials see /class/231994/csci-1300-university-of-colorado-at-boulder in Computer Science at the University of Colorado at Boulder.

Date Created: 10/29/15
CSCI 1300: Artificial Intelligence Lecture
Mike Mozer, December 4, 2003

Computer Science: Operating Systems, Programming Languages, Networking, Security, Theory, Artificial Intelligence

Artificial Intelligence: Natural Language Understanding, Speech Recognition, Computer Vision, Robotics, Reasoning, Planning, Machine Learning

Machine Learning
  Supervised learning: spam filters (hotmail.com), ALVINN (autonomous vehicle navigation)
  Unsupervised learning: collaborative filtering (amazon.com), fault monitoring
  Reinforcement learning: TD-Gammon (champion backgammon-playing program), elevator controller, adaptive home lighting/heating control

Reinforcement Learning: A Simple Example
Suppose you are in one of two states: hungry or sleepy.
Suppose you can take one of two actions: go to Turley's or lie on bed.
Reward contingencies:
  hungry -> go to Turley's: reward
  hungry -> lie on bed: no reward
  sleepy -> go to Turley's: no reward
  sleepy -> lie on bed: reward
Reward depends on what action you take in a given state.

How do you learn to take the correct action? Trial and error. Through experience, the system can learn to predict the reward that will be obtained for some action given the current state: reward(action, state), also notated Q(state, action). Given the expected reward, the agent can choose the best action:
  if Q(hungry, go to Turley's) > Q(hungry, lie on bed), then go to Turley's; else lie on bed.

Reinforcement Learning in the Real World
[Diagram: time intervals 1-7, each with a state s1..s7, an action, and an instantaneous reinforcement.]
Issues:
  Delayed reinforcement (e.g., car accident due to worn tires)
  Occasional reinforcement (e.g., chess playing)
  Short-term versus long-term rewards (e.g., skipping class)
  Exploration versus exploitation (e.g., trying new restaurants)
  Partially observable state (e.g., viral infection)
  Multiple agents (e.g., multiple elevators)

Elevator Control
"Improving Elevator Performance Using Reinforcement Learning," in D. S. Touretzky, M. C. Mozer, and M. E. Hasselmo (eds.), Advances in Neural Information Processing Systems 8, MIT Press, Cambridge, MA, 1996.
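The two-state example above can be learned by simple trial and error. A minimal tabular Q-learning sketch in Python; the reward contingencies come from the lecture, while the learning rate, exploration rate, and episode count are illustrative choices:

```python
import random

STATES = ["hungry", "sleepy"]
ACTIONS = ["go to Turley's", "lie on bed"]

def reward(state, action):
    # hungry -> go to Turley's: reward; sleepy -> lie on bed: reward
    if state == "hungry" and action == "go to Turley's":
        return 1.0
    if state == "sleepy" and action == "lie on bed":
        return 1.0
    return 0.0

def learn(episodes=2000, alpha=0.1, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        s = rng.choice(STATES)              # nature picks the state
        if rng.random() < epsilon:          # explore occasionally...
            a = rng.choice(ACTIONS)
        else:                               # ...otherwise exploit Q
            a = max(ACTIONS, key=lambda u: Q[(s, u)])
        # one-step problem: nudge Q(s, a) toward the observed reward
        Q[(s, a)] += alpha * (reward(s, a) - Q[(s, a)])
    return Q

Q = learn()
best = {s: max(ACTIONS, key=lambda u: Q[(s, u)]) for s in STATES}
# learned policy: go to Turley's when hungry, lie on bed when sleepy
```

Exploration matters even in this toy problem: the rewarding action for "sleepy" is only discovered through the occasional random action, after which its Q value grows and it becomes the greedy choice.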
Robert H. Crites and Andrew G. Barto
Computer Science Department, University of Massachusetts, Amherst, MA

Abstract: This paper describes the application of reinforcement learning (RL) to the difficult real-world problem of elevator dispatching. The elevator domain poses a combination of challenges not seen in most RL research to date. Elevator systems operate in continuous state spaces and in continuous time as discrete event dynamic systems. Their states are not fully observable, and they are nonstationary due to changing passenger arrival rates. [Remainder of abstract not recoverable.] The results demonstrate the power of RL on a very large-scale stochastic dynamic optimization problem of practical utility.

Elevator Control
Table 3: Results for Down-Peak Profile with Up and Down Traffic. [Table contents not recovered.]
Table 4 shows the results for the down-peak traffic profile with up and down traffic, including an average of 4 up passengers per minute at the lobby. This time there is twice as much up traffic, and the RL agents generalize extremely well to this new situation.
Table 4: Results for Down-Peak Profile with Twice as Much Up Traffic. [Table contents not recovered.]

Q-learning (Watkins, 1989; Watkins & Dayan, 1992)
Q(x, u): if action u is taken in state x, what is the minimum cost we can expect to obtain?
Policy based on Q values (exploration rate epsilon):
  u_t = argmin_u Q(x_t, u)  with probability 1 - epsilon
  u_t = random action       with probability epsilon
Incremental update rule for Q values:
  Q(x_t, u_t) <- (1 - alpha) Q(x_t, u_t) + alpha [ c_t + gamma min_u Q(x_{t+1}, u) ]
where alpha is the learning rate and gamma is the discount factor.
Given fully observable state, infinite exploration, etc., Q-learning is guaranteed to converge on the optimal policy.

The Adaptive House
Michael Mozer, Robert Dodier, Debra Miller, Marc Anderson, Josh Anderson, Dan Bertini, Matt Bronder, Michael Colagrosso, Robert Cruickshank, Diane Lukianow, Tom Moye, Charles Myers, Tom Pennell, James Ries, Brian Daugherty, Erik Skorpen, Mark Fontenot, Joel Sloss, Okechukwu Ikeako, Lucky Vidmar, Paul Kooros, Matthew Weeks
University of Colorado: Department of Computer Science; Institute of Cognitive Science; Department of Civil, Environmental, and Architectural Engineering; Department of Electrical and Computer Engineering; Department of Mechanical Engineering; Department of
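The epsilon-greedy policy and the incremental update rule above translate directly to code. A minimal sketch using cost-minimizing Q values as in the slide; the toy states, actions, and cost in the example are illustrative, not the elevator simulator from the paper:

```python
import random

def q_update(Q, x, u, cost, x_next, actions, alpha=0.1, gamma=0.9):
    # Q(x_t,u_t) <- (1-alpha) Q(x_t,u_t) + alpha [c_t + gamma min_u' Q(x_{t+1},u')]
    target = cost + gamma * min(Q[(x_next, u2)] for u2 in actions)
    Q[(x, u)] = (1 - alpha) * Q[(x, u)] + alpha * target
    return Q[(x, u)]

def epsilon_greedy(Q, x, actions, epsilon, rng=random):
    # argmin_u Q(x, u) with probability 1 - epsilon; random action otherwise
    if rng.random() < epsilon:
        return rng.choice(actions)
    return min(actions, key=lambda u: Q[(x, u)])

# Toy illustration: all Q values start at zero; one costly transition.
actions = ["a0", "a1"]
Q = {(x, a): 0.0 for x in ["x0", "x1"] for a in actions}
q_update(Q, "x0", "a0", cost=1.0, x_next="x1", actions=actions)
# New Q(x0, a0) = 0.9*0 + 0.1*(1.0 + 0.9*0) = 0.1, so a greedy
# (epsilon = 0) policy now prefers a1 in x0, the lower-cost action.
```

Because Q here estimates cost rather than reward, both the policy and the bootstrap target use min where reward-based Q-learning would use max.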
Aerospace Engineering
http://www.cs.colorado.edu/~mozer/adaptive-house

The adaptive house: not a programmable house, but a house that programs itself.
The house adapts to the lifestyle of the inhabitants:
  The house monitors environmental state and senses actions of the inhabitant.
  The house learns the inhabitants' schedules, preferences, and occupancy patterns.
  The house uses this information to achieve two objectives: (1) anticipate inhabitant needs, and (2) conserve energy.
Domain: home comfort systems (air heating, lighting, water heating, ventilation).

[Floor plan of the adaptive house, including the great room, kitchen, and bathroom; details not recoverable.]

[Diagram: sensors and controls (water heater, furnace) connected to computers.]
Training signals:
  Actions performed by the inhabitant specify setpoints (anticipation of inhabitant desires).
  Gas and electricity costs (energy conservation).

A reinforcement learning framework
Each constraint has an associated cost:
  discomfort cost, if inhabitant preferences are neglected
  energy cost, which depends on the device and its intensity setting
The optimal control policy minimizes the expected total cost
  J(t0) = E[ sum over t = t0 ... t0+K-1 of ( d(x_t) + e(u_t) ) ]
where
  t indexes over nonoverlapping time intervals,
  t0 is the current time interval,
  u_t is the control decision for interval t,
  x_t is the environmental state during interval t,
and d and e are the discomfort and energy costs.

ACHE: Adaptive Control of Home Environments
Separate control system for each task:
  air temperature regulation (furnace, space heaters)
  lighting regulation (wall sconces, overhead lights)
  water temperature regulation (water heater)
[Diagram: inhabitant actions and energy costs feed into ACHE; environmental state in, device setpoints out.]
[Diagram: general architecture of ACHE: instantaneous environmental state -> state representation (occupied zones, setpoint profile, future state information) -> decision.]

Lighting control
What makes lighting control a challenge?
  Twenty-two banks of lights, each with 16 intensity levels (seven banks of lights in the great room alone).
  Motion-triggered lighting does not work.
  Lighting moods.
  Two constraints must be satisfied simultaneously: maintaining
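The objective J simply accumulates discomfort and energy costs over time intervals. A minimal sketch with hypothetical cost functions d() and e(); the real ACHE costs are device- and preference-specific:

```python
def total_cost(states, decisions, d, e):
    # J(t0): sum over K nonoverlapping intervals of d(x_t) + e(u_t)
    return sum(d(x) + e(u) for x, u in zip(states, decisions))

# Illustrative only: three intervals; d() charges when the inhabitant's
# preferences are neglected, e() charges when the device is running.
d = lambda x: 0.5 if x == "uncomfortable" else 0.0   # discomfort cost
e = lambda u: 0.2 if u == "on" else 0.0              # energy cost
J = total_cost(["comfortable", "uncomfortable", "comfortable"],
               ["on", "off", "on"], d, e)
# J = 0.2 + 0.5 + 0.2 = 0.9
```

The two terms pull in opposite directions: running a device raises e() but usually lowers d(), which is exactly the trade-off the controller must learn to balance.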
lighting according to inhabitant preferences, and conserving energy.
  Range of time scales involved.
  Sluggishness of the system.

Resolving the sluggishness dilemma: the Anticipator
A neural network that predicts which zones will become occupied in the next two seconds.
Input: 1-, 3-, and 6-second averages of motion signals; instantaneous and 2-second average of door status; instantaneous, 1-second, and 3-second averages of sound level; current zone occupancy status and durations; time of day.
Output: p(zone i becomes occupied in next 2 seconds | currently unoccupied).
Runs every 250 ms.

Training the anticipator
The occupancy model provides the training signal. Two types of errors:
  miss: from state(t - 2000 ms) through state(t - 250 ms), zone i becomes occupied without having been predicted
  false alarm: state(t) predicted occupancy, but zone i remains vacant
Training procedure:
  Given a partially trained net, collect misses and false alarms.
  Retrain the net when 200 additional examples have been collected.
  TD algorithm for misses.
[Plot: hit, miss, and false-alarm rates vs. number of training examples (0-60,000).]

[Figure: examples of anticipator performance; not recoverable.]

Lighting controller costs
  Energy cost: 7.2 cents per kWh.
  Discomfort cost: 1 cent per device whose level is manually adjusted.
  Anticipator miss cost: 1 cent per device that was off and should have been on.
  Anticipator false-alarm cost: 1 cent per device that was turned on unnecessarily.

Results
About three months of data collection; events logged only from 19:00 to 06:59.
[Plot: energy cost vs. events (2000-8000).]

Air temperature control
[Plot: occupancy, furnace on/off, and cost vs. time of day, Sunday, March 6, 2000.]

Comparison of control policies using artificial occupancy data
[Plot, productivity loss $10/hr: mean cost/day vs. variability index for constant temperature, setback thermostat, occupancy-triggered, and Neurothermostat policies.]
[Plot, productivity loss $30/hr: mean cost/day vs. variability index for the same four policies.]
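The lighting controller's objective totals these per-event charges. A minimal accounting sketch; the rates are parameters and the event counts in the example are illustrative:

```python
def lighting_cost_cents(kwh, manual_adjustments, misses, false_alarms,
                        energy_rate=7.2, event_cost=1.0):
    # energy_rate: cents per kWh of lighting energy used.
    # event_cost: cents per discomfort, miss, or false-alarm event
    # (1 cent each in the notes).
    return (energy_rate * kwh
            + event_cost * (manual_adjustments + misses + false_alarms))

# e.g. 2 kWh of lighting energy, 3 manual adjustments, 1 miss, 2 false alarms
cost = lighting_cost_cents(2, 3, 1, 2)
# 7.2*2 + 1*(3 + 1 + 2) = 20.4 cents
```

Pricing misses and false alarms in the same currency as energy is what lets a single scalar cost drive both comfort and conservation.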
Comparison of control policies using real occupancy data

Mean daily cost (dollars), by assumed productivity loss:

  Policy                 p1 ($10/hr)   p3 ($30/hr)
  Neurothermostat            6.77          7.05
  constant temperature       7.85          7.85
  occupancy-triggered        7.49          8.66
  setback thermostat         8.12          9.74

