
2. Applied

The hit new social media site FOL-owo Me involves users following users' posts.

This social media site can be represented as the tuple (S, F), where S denotes the set of all the users on the site and F denotes the set of ordered pairs recording who follows whom. This means we can think of S as a unary predicate and F as a binary predicate.

For example, (Mac, Excellent Elf) ∈ F means Mac follows Excellent Elf.

In FOL-owo Me, the follow restriction holds: ∀x∀y(Fxy → (Sx ∧ Sy)).

Users can follow as many users (including themselves) as they wish.

(a) Draw out a sample instance of FOL-owo Me with at least 4 users and at

least 5 ordered pairs in F. Give the users family-friendly names.
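
A sample instance satisfying part (a) might look like the following (a minimal sketch; the names are hypothetical and chosen only for illustration):

```latex
S = \{\text{Alice}, \text{Bob}, \text{Carol}, \text{Dana}\}
F = \{(\text{Alice}, \text{Bob}),\ (\text{Bob}, \text{Carol}),\ (\text{Carol}, \text{Alice}),\ (\text{Alice}, \text{Dana}),\ (\text{Dana}, \text{Dana})\}
```

Here (Dana, Dana) ∈ F records a self-follow, and every name appearing in a pair also appears in S, so the follow restriction ∀x∀y(Fxy → (Sx ∧ Sy)) holds.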

(b) Assume the worst-case scenario for the servers: everyone is following everyone (including themselves!). Prove by mathematical induction on the number of users, |S|, that |F| ≤ |S|².

Hint: Think of the smallest number of users in the system, and think of how

many more follows are added when we add a new user following everyone.
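
As a quick check on the hint (a sketch of the counting, not a full proof): if all n existing users already follow each other, the worst case has |Fₙ| = n² pairs, and a new user who follows everyone and is followed by everyone adds 2n + 1 pairs:

```latex
|F_{n+1}| = n^2 + \underbrace{n}_{\text{new user follows the old users}} + \underbrace{n}_{\text{old users follow the new user}} + \underbrace{1}_{\text{self-follow}} = (n+1)^2
```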



Most Viewed Questions of Artificial Intelligence

2.3) Which of the following statements is/are true?
- For the sigmoid activation function, as the input to the function becomes larger and positive, the activation value approaches 1. Similarly, as it becomes very negative, the activation value approaches 0.
- For training MLPs, we do not use a step function as the activation function, as in its flat regions there is no slope and hence no gradient to learn from.
- For the tanh activation function, as the value increases along the positive horizontal axis, the activation value tends to 1, and along the negative horizontal axis it tends towards -1, with the centre value at 0.5.
- The ReLU activation function is a piecewise linear function with a flat region for negative input values and a linear region for positive input values.
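
The limiting behaviour these statements describe is easy to check numerically; here is a minimal NumPy sketch (the helper names are my own):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    return np.maximum(0.0, x)

x = np.array([-20.0, -1.0, 0.0, 1.0, 20.0])
print(sigmoid(x))   # approaches 0 for very negative inputs, 1 for very positive ones
print(np.tanh(x))   # tends to -1 and 1 at the extremes; note tanh(0) = 0
print(relu(x))      # flat at 0 for negative inputs, identity for positive inputs
```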


3.4) Which of the following statements is/are true?
- Learning using optimizers such as AdaGrad or RMSProp is adaptive, as the learning rate is changed based on a pre-defined schedule after every epoch.
- Since different dimensions have different impacts, adapting the learning rate per parameter could lead to a good convergence solution.
- Since the AdaGrad technique adapts the learning rate based on the gradients, it could converge faster, but it also suffers from an issue with the scaling of the learning rate, which impacts the learning process and could lead to a sub-optimal solution.
- The RMSProp technique is similar to the AdaGrad technique, but it scales the learning rate using an exponentially decaying average of squared gradients.
- The Adam optimizer always converges to a better solution than the stochastic gradient descent optimizer.
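
A minimal sketch of the two per-parameter update rules the statements compare, in plain NumPy (function and variable names are my own):

```python
import numpy as np

def adagrad_step(w, grad, cache, lr=0.01, eps=1e-8):
    # AdaGrad accumulates *all* past squared gradients, so the effective
    # learning rate can only shrink over time (the scaling issue noted above)
    cache = cache + grad ** 2
    w = w - lr * grad / (np.sqrt(cache) + eps)
    return w, cache

def rmsprop_step(w, grad, cache, lr=0.01, decay=0.9, eps=1e-8):
    # RMSProp instead keeps an exponentially decaying average of squared
    # gradients, so the scaling does not grow without bound
    cache = decay * cache + (1.0 - decay) * grad ** 2
    w = w - lr * grad / (np.sqrt(cache) + eps)
    return w, cache
```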


2.2) Which of the following statement(s) is/are true?
- By adding one or more layers with activation functions to a perceptron network, non-linearly separable cases can be handled.
- For a non-linearly separable dataset, if the activation function applied on the hidden layer(s) of the multi-layer perceptron network is a linear function, the model will converge to an optimal solution.
- For a linearly separable dataset, applying a non-linear activation function such as sigmoid or tanh on the hidden layers of an MLP network can converge to a good solution.
- All of the above
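
One way to probe the linear-activation case numerically (a minimal sketch; the weights are arbitrary illustrative values):

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))   # hidden-layer weights
W2 = rng.normal(size=(3, 1))   # output-layer weights
x = rng.normal(size=(5, 4))    # a batch of 5 inputs

# With an identity (linear) activation, two layers equal one linear layer:
two_layer = (x @ W1) @ W2
one_layer = x @ (W1 @ W2)
print(np.allclose(two_layer, one_layer))  # True: no added expressive power
```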


2.7) Which of the following statements is/are true?
- At the output layer of a binary classification problem, we could use either a sigmoid activation function with a single output neuron or a softmax function with two neurons, to produce similar results.
- For a classification application that predicts whether a task is personal or official, and which also predicts whether it is high or low priority, we could use two output neurons and apply the sigmoid activation function on each output neuron.
- For a classification application that predicts whether Team A would win, lose, or draw the match, we could use three output neurons with the softmax activation function applied at the output layer.
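
The sigmoid/softmax correspondence in the first option can be verified directly; a minimal sketch (the logit value is arbitrary):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

z = 0.7  # a single logit for the positive class
p_single = sigmoid(z)
# A two-neuron softmax over logits (z, 0) gives the same positive-class probability:
p_pair = softmax(np.array([z, 0.0]))[0]
print(p_single, p_pair)  # identical up to floating point
```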


2.1) Which of the following statement(s) is/are true?
- A perceptron network with just an input and an output layer converges for linearly separable cases only.
- The logistic regression model and the perceptron network are the same model.
- Unlike the perceptron network, the logistic regression model can converge for non-linear decision boundary cases.
- By changing the activation function to logistic and by using the gradient descent algorithm, the perceptron network can be made to address non-linear decision boundary cases.
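
For reference, a minimal sketch of the classic perceptron update rule these statements refer to (names are my own; labels are 0/1):

```python
import numpy as np

def perceptron_step(w, x, y, lr=1.0):
    # Hard-threshold (step) activation: no gradient, update only on mistakes
    y_hat = 1 if np.dot(w, x) >= 0.0 else 0
    if y_hat != y:
        w = w + lr * (y - y_hat) * x
    return w
```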


2.8) Which of the following points hold true for gradient descent?
- It is an iterative algorithm, and at every step it finds the gradient of the cost function with respect to the parameters to minimize the cost.
- For a pure convex function or a regular bowl-shaped cost function, it can converge to the optimal solution irrespective of the learning rate.
- Since cost functions can be of any shape, it can only achieve a local minimum but not the global minimum, provided the cost function is not convex.
- Normalizing the variables and bringing the magnitudes of these variables to the same scale ensures faster convergence.
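
A minimal sketch of the iterative update the first point describes, on a bowl-shaped one-dimensional cost (the cost function is my own illustrative choice):

```python
# Gradient descent on f(w) = (w - 3)^2, whose gradient is 2 * (w - 3)
w, lr = 0.0, 0.1
for _ in range(100):
    grad = 2.0 * (w - 3.0)
    w -= lr * grad          # step against the gradient to reduce the cost
print(w)  # approaches the minimiser w = 3
```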


(2) A report in PDF format (8 marks) should have the following sections:
a. A description of how to handle the missing values in your code, and report the results. (2.5 marks)
b. A description of the regression technique you used. (3 marks)
c. A description of the results to report. (2.5 marks)


2.6) Which of the following statements is/are true?
- To perform regression using MLPs, non-linear activation functions might be required in the hidden layer, but generally no non-linear activation function is required for the output layer.
- If you want to predict the age of a person, we can use the ReLU activation function in the output layer.
- For a regression problem in which the output value is always within a range of values, we could use the sigmoid or tanh function and scale the values to ensure the output is in the bounded range.
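
A minimal sketch of the bounded-range idea in the last option, assuming the target range (lo, hi) is known in advance (the function name is my own):

```python
import numpy as np

def bounded_output(z, lo, hi):
    # Sigmoid squashes the raw output into (0, 1); an affine rescale
    # then maps it onto the known target range (lo, hi)
    s = 1.0 / (1.0 + np.exp(-z))
    return lo + (hi - lo) * s

print(bounded_output(np.array([-5.0, 0.0, 5.0]), 0.0, 100.0))
```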


3.3) Which of the following statements is/are true?
- When applying momentum optimization, setting a higher momentum coefficient will always lead to faster convergence.
- Unlike in gradient descent, in momentum optimization the gradients at a step are dependent on the previous step.
- Unlike in stochastic gradient descent, in momentum optimization the path to convergence is faster but with high variance.
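
A minimal sketch of the momentum update the second statement describes (names are my own):

```python
def momentum_step(w, grad, velocity, lr=0.01, beta=0.9):
    # The velocity accumulates past gradients, so each step depends on
    # the history of previous steps, not just the current gradient
    velocity = beta * velocity - lr * grad
    return w + velocity, velocity
```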


3. Assessment

The following questions were given to ChatGPT. Determine what's correct, fix all the mistakes, and assign a letter grade: A, B, C, D, F.

Rainbow sequences are a special flavor of infinite sequences. They are defined inductively as follows.

i) The infinite sequence of increasing positive odd integers (1, 3, 5, 7, 9, 11, ...) is a rainbow sequence.

ii) If (t₁, t₂, t₃, ..., tᵢ, ...) is a rainbow sequence, then (t₁, t₂, t₃, ..., tᵢ + 2, ...) (the sequence just like the first one, except any single value tᵢ is replaced with the value tᵢ + 2) is a rainbow sequence.

iii) If (t₁, t₂, t₃, ..., tᵢ, ...) is a rainbow sequence, then (t₁, t₂, t₃, ..., tᵢ − 2, ...) (the sequence just like the first one, except any single value tᵢ is replaced with the value tᵢ − 2) is a rainbow sequence.

a) Prove by structural induction that every rainbow sequence is an infinite sequence consisting only of odd values.

ChatGPT's answer:

1. Base Case: The initial rainbow sequence, (1, 3, 5, 7, 9, ...), is an infinite sequence consisting only of odd values. This is true because it consists of positive odd integers.

2. Inductive Step: Assume that a rainbow sequence, (t₁, t₂, t₃, ..., tᵢ, ...), consists only of odd values. We need to show that the sequence obtained by applying either rule ii) or rule iii) will also consist only of odd values.

- Applying rule ii): The value tᵢ + 2 is obtained by adding 2 to an odd integer tᵢ. Since adding 2 to an odd integer always results in an odd integer, the new sequence will still consist only of odd values.
- Applying rule iii): The value tᵢ − 2 is obtained by subtracting 2 from an odd integer tᵢ. Again, subtracting 2 from an odd integer always results in an odd integer, so the new sequence will still consist only of odd values.

Therefore, based on the base case and the inductive step, we can conclude that every rainbow sequence is an infinite sequence consisting only of odd values.

b) Show that it is not the case that all infinite sequences consisting only of odd values are rainbow sequences. In other words, disprove the converse of part a).

ChatGPT's answer:

To disprove the converse of part a) and show that not all infinite sequences consisting only of odd values are rainbow sequences, we can provide a counterexample. Consider the infinite sequence (1, 3, 5, 7, 9, 11, ...). This sequence consists only of odd values, but it does not follow the rules of rainbow sequences. According to the rainbow sequence rules, the values in the sequence must be obtained by either adding 2 or subtracting 2 from the previous values. In the provided sequence, the difference between consecutive terms is always 2. However, in a rainbow sequence, the difference between consecutive terms alternates between 2 and 0. Therefore, the sequence (1, 3, 5, 7, 9, 11, ...) consisting only of odd values is not a rainbow sequence, disproving the converse of part a).