Encoding Text#
Consider the following examples of movie reviews:
Example 1
Review 1A (negative)
I thought the movie would be great, but it is not. The script is weak, the pacing is slow, and the ending feels pointless.
Review 1B (positive)
I thought the movie would not be great, but it is. The script is strong, the pacing is brisk, and the ending feels meaningful.
Example 2
Review 2A (negative)
The acting is decent, but the plot is predictable and the jokes fall flat.
Review 2B (positive)
The plot is predictable and the jokes fall flat, but the acting is decent.
Example 3
Review 3A (negative)
For the first ninety minutes I waited for something exciting to happen. Spoiler: it never does.
Review 3B (positive)
For the first ninety minutes I waited for something exciting to happen, and when it finally does it is worth every second.
The problem: a bag-of-words representation reduces each review to an unordered multiset of word counts, e.g. \( \text{vector} = (\text{acting}: 1,\ \text{plot}: 1,\ \text{predictable}: 1,\ \text{but}: 1,\ \ldots) \).
Because position is discarded, the classifier cannot learn that “not” negates what follows or that the clause after “but” usually carries the main sentiment.
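Example 2 makes the failure concrete: both reviews contain exactly the same words, so any order-blind count vector is identical for them. A minimal sketch using only Python's standard library (the regex tokenizer here is an illustrative choice, not the one Keras uses):

```python
from collections import Counter
import re

def bag_of_words(text: str) -> Counter:
    # Lowercase, keep alphabetic tokens, and count occurrences (order is lost).
    return Counter(re.findall(r"[a-z']+", text.lower()))

review_2a = "The acting is decent, but the plot is predictable and the jokes fall flat."
review_2b = "The plot is predictable and the jokes fall flat, but the acting is decent."

# The negative and positive reviews collapse to the same representation.
print(bag_of_words(review_2a) == bag_of_words(review_2b))  # True
```

Any classifier trained on these counts must assign both reviews the same score.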
import numpy as np
from tensorflow.keras.layers import Embedding
# 1. Toy vocabulary: 5 tokens
# 0 = <PAD>, 1 = "good", 2 = "bad", 3 = "not", 4 = "movie"
vocab_size = 5
embed_dim = 3 # just three numbers per word
sequence_len = 4
embed = Embedding(input_dim=vocab_size,
                  output_dim=embed_dim)  # the input_length argument is deprecated
# 2. Example review, integer‑encoded and padded:
# "not good movie" → [3, 1, 4, 0]
sample = np.array([[3, 1, 4, 0]]) # shape (1, 4)
dense_seq = embed(sample) # shape (1, 4, 3)
print(dense_seq.numpy().round(2))
[[[ 0.02 0.03 -0.02]
[-0.05 0.04 -0.03]
[ 0.04 0.03 0.05]
[ 0. -0.01 0.03]]]
Word Embeddings#
Unlike bag‑of‑words counts, word embeddings are dense, trainable vectors: during training they gradually move so that “good” and “bad” point in different directions, and a sequential layer (CNN, RNN) that reads the tokens in order can learn that “not” flips the meaning of what follows.
import numpy as np
from tensorflow.keras.layers import Embedding
# 1. Toy vocabulary: 6 tokens
# 0 = <PAD>, 1 = "good", 2 = "bad", 3 = "not", 4 = "movie", 5 = (unnamed)
vocab_size = 6
embed_dim = 2 # just two numbers per word this time
sequence_len = 4
embed = Embedding(input_dim=vocab_size,
                  output_dim=embed_dim)  # the input_length argument is deprecated
# 2. Example review, integer-encoded: [3, 1, 5, 3] ("not", "good", token 5, "not")
sample = np.array([[3, 1, 5, 3]]) # shape (1, 4)
dense_seq = embed(sample) # shape (1, 4, 2)
print(dense_seq.numpy().round(2))
[[[ 0.02 0. ]
[-0.03 0.01]
[ 0.05 0.02]
[ 0.02 0. ]]]
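Note that the first and last rows of the output are identical: token 3 appears at both positions, and an Embedding layer is nothing more than a trainable lookup table, so the same token always maps to the same vector. A NumPy sketch of that lookup (the random weights here stand in for the layer's untrained parameters):

```python
import numpy as np

# An Embedding layer is a trainable lookup table: row i of the weight
# matrix is the dense vector for token i.
rng = np.random.default_rng(0)
vocab_size, embed_dim = 6, 2
weights = rng.normal(scale=0.05, size=(vocab_size, embed_dim))

sample = np.array([3, 1, 5, 3])   # integer-encoded tokens
dense_seq = weights[sample]       # plain row indexing

print(dense_seq.shape)                             # (4, 2)
print(np.array_equal(dense_seq[0], dense_seq[3]))  # True: same token, same vector
```

Gradient descent updates those rows just like any other weight matrix, which is what makes the vectors "trainable."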
Using a CNN#
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
# ------------------------------------------------------------------
# 1. Data loading and preprocessing
# ------------------------------------------------------------------
max_features = 20_000 # keep the 20 000 most frequent words
maxlen = 400 # cut / pad every review to 400 tokens
(x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(
    num_words=max_features
)
x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=maxlen)
x_test = keras.preprocessing.sequence.pad_sequences(x_test, maxlen=maxlen)
# ------------------------------------------------------------------
# 2. CNN model
# ------------------------------------------------------------------
model = keras.Sequential([
    layers.Embedding(max_features, 128),  # input_length is deprecated and no longer needed
    layers.Conv1D(64, 7, activation="relu"),
    layers.MaxPooling1D(3),
    layers.Conv1D(64, 7, activation="relu"),
    layers.GlobalMaxPooling1D(),
    layers.Dense(1, activation="sigmoid")
])
model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    metrics=["accuracy"]
)
model.summary()
# ------------------------------------------------------------------
# 3. Training
# ------------------------------------------------------------------
history = model.fit(
    x_train,
    y_train,
    epochs=8,
    batch_size=128,
    validation_split=0.2,
    verbose=0
)
# ------------------------------------------------------------------
# 4. Evaluation
# ------------------------------------------------------------------
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
print(f"Test accuracy: {test_acc:.3f}")
# ------------------------------------------------------------------
# 5. Predictions (first ten test examples, rounded for readability)
# ------------------------------------------------------------------
preds = model.predict(x_test[:10]).round(3).squeeze()
print("Predicted probabilities:", preds)
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃ Param #       ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ embedding_2 (Embedding)         │ ?                      │ 0 (unbuilt)   │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv1d (Conv1D)                 │ ?                      │ 0 (unbuilt)   │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling1d (MaxPooling1D)    │ ?                      │ 0             │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv1d_1 (Conv1D)               │ ?                      │ 0 (unbuilt)   │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ global_max_pooling1d            │ ?                      │ 0             │
│ (GlobalMaxPooling1D)            │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense (Dense)                   │ ?                      │ 0 (unbuilt)   │
└─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 0 (0.00 B)
Trainable params: 0 (0.00 B)
Non-trainable params: 0 (0.00 B)
A CNN is great at spotting local phrases like “not good,” but it quickly forgets where it found them; sentiment often hinges on relations far apart in the text, so we need a model that can keep track of information across the whole sequence.
A Conv1D with kernel size \(k=7\) can only look at \(7\) consecutive tokens at a time.
Stacking layers enlarges the receptive field only linearly: with stride‑1 convolutions, each additional \(k=7\) layer adds six tokens (one layer sees 7 tokens, two see 13, three see 19, and so on).
Important cues in a review are often dozens of tokens apart (“I hoped it would be great…but it is not.”).
On the IMDB dataset, a CNN typically reaches \(\approx 0.88\) accuracy, about the same as the bag‑of‑words MLP, then plateaus. Extra filters or layers add parameters but do not solve the fundamental distance problem.
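The linear growth is easy to check mechanically. A small helper (a sketch of my own, not part of the notebook) computes the receptive field of a stack of 1‑D conv/pool layers, each described by a (kernel_size, stride) pair:

```python
def receptive_field(layers):
    """Receptive field, in input tokens, of stacked 1-D conv/pool layers.

    layers: sequence of (kernel_size, stride) pairs, applied in order.
    """
    rf, jump = 1, 1  # jump = spacing between adjacent outputs, in input tokens
    for k, s in layers:
        rf += (k - 1) * jump
        jump *= s
    return rf

print(receptive_field([(7, 1)]))                  # 7   one Conv1D with k=7
print(receptive_field([(7, 1)] * 2))              # 13  two stacked convs
print(receptive_field([(7, 1)] * 3))              # 19  three stacked convs
print(receptive_field([(7, 1), (3, 3), (7, 1)]))  # 27  conv-pool-conv, as in the model above
```

Even the conv–pool–conv stack used here covers only 27 of the 400 tokens in a padded review, which is why cues dozens of tokens apart stay out of reach.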